ABSTRACT

While Lanchester's Equations are commonly used as the basis for force-on-force combat models, it
is important to remember that they are not a model of combat, only a model for
combat attrition. There have been numerous attempts to compare historical combat data with the
behaviour expected from Lanchester's Equations. The present work extends this comparison
of historical battle data with the behaviour expected from a battle where attrition is described
by Lanchester's Equations. It examines how analyses of historical battles can contribute to the
development of models of combat, and hence to our understanding of combat, and considers the
processes used in the creation of databases of historical battle results. The historical data is
compared against the expectations of both the deterministic and stochastic forms of Lanchester's
Square Law.
Applications of Historical Analyses in Combat Modelling


Executive  Summary 

Lanchester's Equations are commonly used as the basis for force-on-force combat models,
even if only as a metamodel for a more complex combat model. It is important to
remember that Lanchester's Equations are not a model of combat, only a model for combat
attrition. The equations alone, therefore, cannot be expected to capture other effects such
as the movement of engaged forces. There have been numerous attempts to compare
historical combat data with the behaviour expected from combat models. To validate
differential models of attrition, force and casualty numbers for both sides intermediate to
the starting and finishing values are required. That level of detail is rarely available and
often does not exist. An alternative approach using only the initial and final values of
the engaged forces' strengths has been tried previously. However, Lanchester's Equations
describe the behaviour of a single system in time while the historical databases contain
information about an ensemble of battles, each potentially with different values of the attrition
rate coefficients. The issue of why the results from such an ensemble follow the behaviour
expected from Lanchester's Equations has never been adequately explored or explained.

The present work extends this comparison of historical battle data with the behaviour
expected from a battle where attrition is described by Lanchester's Equations. It examines
how analyses of historical battles can contribute to the development of models of combat,
and hence to our understanding of combat, and considers the processes that
are used in the creation of databases of historical battle results. The implications of those
processes for the limitations of this form of analysis, the constraints they impose and the
resulting inherent biases are discussed, as well as methods that can be used to quantify
and mitigate their effects.

The historical data is compared against the expectations of both the deterministic and
stochastic forms of Lanchester's Square Law. However, it should be noted that the
examination of Lanchester's Equations and of stochastic differential equations is not intended to be
comprehensive or rigorous. Both have been covered extensively elsewhere, including by
the author, and the present work contains numerous references to more authoritative works
on these subjects for the interested reader.

Finally,  evidence  for  considering  battle  as  a  particular  type  of  complex  adaptive  system, 
one  that  involves  co-evolution  and  scale  free  behaviours,  is  examined.  It  is  proposed  that 
this  may  be  responsible  for  the  unexpected  observation  that  the  behaviour  of  several 
parameters  used  to  characterise  combat  is  the  same  for  both  an  ensemble  of  different 
battles  and  for  the  evolution  of  a  single  battle. 
a, b, c, d              Constant attrition coefficients
A(x,y), B(x,y)          Attrition/drift coefficient functions
α, β                    Equation of state parameters
c1, c2                  Constant variance coefficients
S1(x,y), S2(x,y)        Variance rate functions
s                       Arbitrary exponents
dx/dt, dy/dt            Rate of change of force strength with time
f(), g(), F()           Arbitrary functions
∂f/∂u                   Partial derivative of f with respect to variable u
ln                      Natural logarithm
x(t), y(t)              Strength of forces at time t
x0, y0                  Initial force strengths
ρ                       Correlation coefficient
x, y                    Ratio of forces current to original strength
V, P                    Defender's advantage parameters
N(t)                    Arbitrary force strength
c(t)                    Arbitrary force casualties
ν                       Frequency
                        Entropy
S(ν)                    Frequency spectral density
R^2                     Coefficient of determination
z1, z2                  Stochastic functions
dz                      Wiener process
ACW                     American Civil War
LSR                     Least Squares Regression
MLE                     Maximum Likelihood Estimator
QJM                     Quantified Judgement Model
< >                     Expectation value of the object within the brackets

UNCLASSIFIED 


DSTO-TR-2643 




1.  Introduction 


During the First World War F. W. Lanchester described one of the simplest, and most
enduring, mathematical attrition models of force-on-force combat [1]. He proposed two
systems of equations, depending on whether the fighting was "collective" or not. Collective
combat between an attacking side of strength x and a defending side of strength y is
described by the equations:


\[
\begin{aligned}
\frac{dx}{dt} &= -a\,y(t), & x(0) &= x_0, \\
\frac{dy}{dt} &= -b\,x(t), & y(0) &= y_0
\end{aligned}
\tag{1}
\]


which  result  in  the  equation  of  state: 


\[
\frac{x_0^2 - x^2}{y_0^2 - y^2} = \frac{a}{b}
\tag{2}
\]


The quadratic form of this relation gives the system of equations its name, the
Lanchester Square Law. Individual combat, on the other hand, is described by equations that
produce an equation of state in which the force strengths are related linearly. The question of
how each side's strength is to be measured is deferred until the instantiation of a historical
battle database is considered.
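
As a check on the relationship between equations (1) and (2), the following minimal Python sketch evaluates the standard closed-form solution of the Square Law and confirms that the ratio in equation (2) stays equal to a/b as the battle evolves. The function names and numerical values are illustrative assumptions, not taken from any battle database, and the closed form applies only while both strengths remain positive.

```python
import math

def square_law(x0, y0, a, b, t):
    """Closed-form solution of dx/dt = -a*y, dy/dt = -b*x at time t.

    Valid only while both x and y remain positive.
    """
    k = math.sqrt(a * b)
    x = x0 * math.cosh(k * t) - y0 * math.sqrt(a / b) * math.sinh(k * t)
    y = y0 * math.cosh(k * t) - x0 * math.sqrt(b / a) * math.sinh(k * t)
    return x, y

def state_ratio(x0, y0, x, y):
    """Left-hand side of the equation of state (2)."""
    return (x0 ** 2 - x ** 2) / (y0 ** 2 - y ** 2)

# Hypothetical initial strengths and attrition coefficients.
x0, y0, a, b = 1000.0, 800.0, 0.01, 0.02

ratios = []
for t in (1.0, 5.0, 10.0):
    x, y = square_law(x0, y0, a, b, t)
    ratios.append(state_ratio(x0, y0, x, y))
    print(t, round(x, 1), round(y, 1), ratios[-1])
```

The printed ratio should remain at a/b for every time sampled, which is the sense in which equation (2) is an equation of state: it relates the strengths at any instant without reference to time.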


The assumption of "collective" combat is unlikely to apply throughout an entire battle, and
hence real world attrition results from a combination of collective and individual combats. This
possibility has long been recognised and has prompted many attempts to generalise Lanchester's
system of equations to better represent actual combat results.
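
For comparison with the Square Law, the linear relationship mentioned above for individual combat can be written out explicitly. The form below is one commonly quoted version of the Lanchester Linear Law and is offered as an illustration only: the text does not state these equations, so the choice of an attrition rate proportional to the product of the two strengths is an assumption, with a and b playing the same role as in the collective case.

```latex
\[
\frac{dx}{dt} = -a\,x(t)\,y(t), \qquad
\frac{dy}{dt} = -b\,x(t)\,y(t)
\quad\Longrightarrow\quad
\frac{dx}{dy} = \frac{a}{b}
\quad\Longrightarrow\quad
\frac{x_0 - x}{y_0 - y} = \frac{a}{b}
\]
```

Here the losses of the two sides, rather than the differences of their squared strengths, stand in a fixed ratio.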

Lanchester's model was developed as a description of air combat, in which each side was
essentially composed of a single type of combat element. Force strength was then a simple
matter of counting the number of aircraft on a side. Modern applications of Lanchester's ideas
to land combat run into the problem that each side consists of a number of types of combat
element (infantry, artillery, tanks etc.), each of which interacts differently with each of the
opposing side's combat types. The development of heterogeneous combat models is central to
most current military combat simulations [2]. It is important to remember that Lanchester's
Equations are not a model of combat, only a model for combat attrition. The equations alone,
therefore, cannot be expected to capture other effects such as the movement of engaged forces.
This is frequently forgotten, for example by Epstein [3].

There  have  been  numerous  attempts  to  compare  historical  combat  data  with  the  behaviour 
expected  from  Lanchester's  Equations,  including  the  work  of  Helmbold  [4]  and  Hartley  [5]. 
Hartley  also  includes  a  comprehensive  review  of  the  effort  to  validate  combat  attrition  laws 
using historical analysis. Recent work by the author has also investigated the ability of
Lanchester's Equations to describe patterns observed in the casualty statistics using Hartley's
database of historical battles. This includes an examination of the effect on casualty values of
including a fractal model of spatial dispersion [6] and the distribution of casualties when
Lanchester's Equations are modelled as stochastic processes [7].

1.1  Report  Overview 

The present work seeks to extend this comparison of historical battle data with the
behaviour expected from a battle where attrition is described by Lanchester's Equations. It
begins with an examination of how analyses of historical battles can contribute to the
development of models of combat and hence our understanding of combat. This is followed
by consideration of the processes that are used in the creation of databases of historical battle
results. The implications of those processes for the limitations of this form of analysis, the
constraints they impose and the resulting inherent biases are discussed, as well as methods
that can be used to quantify and mitigate their effects. This is followed by a brief review of
how historical battle results have been and may be used to validate proposed attrition
relationships, including examination of the presence of bias in the database using a
sub-sampling approach.

Next, the author's previous work examining Lanchester's Equations modelled as stochastic
processes is revisited and extended. However, it should be noted that the examination of
Lanchester's Equations and stochastic differential equations presented in the present work is
not intended to be comprehensive, rigorous or complete. Both have been covered extensively
elsewhere, including by the author, and the present work contains numerous references to
more authoritative works on these subjects for the interested reader or the reader unfamiliar with
the use of Lanchester's Equations.

The author's previous analyses of Lanchester's Equations using historical battle data are then
revisited, using the larger database developed for the present work. Finally, evidence for
considering battle as a particular type of complex adaptive system, one that involves
co-evolution and scale free behaviours, is examined. It is proposed that this may be responsible
for the unexpected observation that the behaviour of several key parameters used to
characterise combat is the same for both an ensemble of different battles and for the evolution
of a single battle.


2. Modelling Paradigms and their Application in Combat Modelling

At  its  most  basic,  a  model  refers  to  a  conceptual  representation  of  some  aspect  of  reality. 
Models  are  used  when  they  are  easier  to  understand  than  those  aspects  of  reality  they 
represent.  Complex  phenomena  often  require  complex  models  if  the  model's  behaviour  is  to 
reproduce  that  of  the  real  world.  However,  while  such  models  produce  reasonable  agreement 
with  real  world  results,  they  are  less  often  useful  in  understanding  the  functional  dependence 
of the modelled quantity on the input parameters. In such cases it is useful to develop a
(simpler) model of that model which, although providing lower fidelity results, is better at
explaining the causes of those results [2].

Models  can  be  classified  into  three  descriptive  types  [8],  according  to  the  degree  of  abstraction 
required: 

•  iconic  models  such  as  drawings  or  miniatures, 

•  analogue  models  in  which  other  physical  relationships  represent  the  relationships 
under  study,  and 

•  symbolic  models  where  abstract  symbols  or  quantities  are  used  to  describe  the  real 
world. 

Symbolic  models  are  in  turn  divided  into  conceptual  and  mathematical  models.  Conceptual 
models  include  descriptions,  plans  and  diagrams  including  charts.  Mathematical  models 
represent reality through quantitative relationships. They are further divided into static or
dynamic depending on whether the model allows its variables to change over time.
Mathematical models can also be classified as analytic or simulation depending on whether an
exact closed form solution exists or whether a related sequence of models is used to
converge to a solution for a complex problem. Simulation models are also divided into
deterministic and stochastic models depending on whether uncertainty and risk are explicitly
represented.

As  usual  a  trade-off  is  involved  when  deciding  which  type  of  model  to  develop  for  a 
particular  situation  [9].  Simulation  models  capture  real  world  complications  better  than 
analytic  models.  Analytic  models  can  typically  be  solved  under  restricted  conditions  of 
population  size  or  time,  and  are  generally  poor  at  describing  transient  behaviour.  The  most 
useful  analytic  models  generally  describe  the  real  world  using  simple  functional  relationships. 

There  are  two  basic  paradigms  for  developing  mathematical  models,  including  models  of 
combat. 

2.1  Reductionist 

The reductionist approach attempts to describe a complex system by reducing the system to
the interactions of its parts [10]. The component parts may also be reduced to simpler or more
fundamental objects and the interactions between them. A complex system is viewed as no
more than "the sum of its parts" in which all phenomena can be explained in terms of other,
more fundamental, phenomena. Reductionism does not exclude the possibility of "emergent
behaviour"1, but does hold that such behaviour can be explained in terms of the
phenomena from which it emerges. This paradigm is attractive to model developers as a
complex model is obtained simply through the aggregation of simpler models of its
fundamental behaviours.


1  In  the  present  work  emergent  behaviour  is  the  way  complex  systems  and  patterns  occur  from  many  relatively 
simple  interactions. 
Taylor [11] has produced the most readily accessible and comprehensive reductionist treatise
on force-on-force level Lanchester type models of attrition. It includes a theoretical treatment
of differential equation models of attrition in force-on-force combat operations, providing
both an introduction to and current overview of such models as well as a comprehensive and
in-depth treatment of them. Both deterministic and stochastic models are considered.
However, the resulting simplicity of a force-on-force level model limits its practical
application.

Taylor goes on to identify which elements of a Lanchester type force-on-force level attrition
model could be replaced with models composed of more fundamental objects in accord with
the reductionist paradigm. These include attrition rate coefficients and force composition
(homogeneous versus heterogeneous), but he does not explore how this might be achieved in
practice.

The  price  of  continuing  the  reductionist  agenda  and  including  models  of  component 
phenomena,  is  that  the  analytic  approach  to  exploring  the  system  properties  has  proven  to 
have  limited  application.  A  simulation  based  approach  is  generally  used  instead. 

The origins of quantitative models of combat attrition lie in the early 20th century among
several authors working independently. Fowler [12] reviews the pioneering work by Chase,
Fiske, Lanchester and Osipov, giving mathematical descriptions of their models, all of which
can be regarded as variations of the same basic concepts and can be treated as particular cases
of the generalised Lanchester Equations.

Fowler  develops  a  detailed  mathematical  examination  of  the  solutions  to  these  equations 
using  several  different  approaches,  emphasising  a  number  of  particular  combat  types 
including  both  the  linear  and  square  law  forms  of  the  Lanchester  equations.  In  addition  to 
examination  of  Lanchester's  coupled  differential  equations,  Fowler  includes  an  examination  of 
the  assumptions  that  underpin  them  and  their  solutions. 

Fowler  also  attempts  to  combine  historical  analysis  with  the  mathematical  analysis  of  combat 
models.  In  common  with  other  authors,  such  as  Helmbold  [4],  he  does  not  consider  why  the 
behaviour  of  data  from  a  collection  of  unrelated  battles  can  often  be  described  using 
relationships  developed  from  a  model  of  the  evolution  of  a  single  battle.  Although  the 
functional  relationships  between  initial  and  final  states  are  the  same  for  such  a  collection  of 
battles,  their  different  and  unrelated  controlling  parameters  (such  as  attrition  ratio)  should 
have  unconstrained  values  resulting  in  a  range  of  final  state  values  that  obscure  the  functional 
relationships (as noted by Hartley [5]). Fowler's other comments on the use of historical
analysis give the impression that he does not value its contribution highly, nor indeed any
analysis that does not proceed from first principles. The work of Dupuy [13], while noted for
its aesthetic form, is dismissed as merely empirical and lacking theoretical foundation. While a
number of these criticisms are justified, the contribution that can be made by historical
analysis and empirical studies in general is a subject of the present work.

In  his  pursuit  of  a  description  of  combat  attrition  from  first  principles,  most  of  Fowler's  work 
consists  of  an  examination  of  the  derivation  of  values  for  the  attrition  coefficients  that  appear 
in Lanchester's Equations. Evaluation of these coefficients is one of the issues previously
identified by Taylor [11] as requiring further investigation. As part of this examination Fowler
undertakes  a  comprehensive  study  of  the  wide  range  of  factors  that  contribute  to  the 
evaluation of an attrition coefficient, including the Bonder-Farrell theory of attrition rate
coefficients, the theory of kill chains and heterogeneous engagements. The examination of the
stochastic behaviour of Lanchester's Equations, using the Fokker-Planck equation, is less
concerned  with  studying  variability  in  combat  outcomes  than  in  justifying  the  standard 
equilibrium  approximation  whereby  the  variables  in  the  equations  are  replaced  by  their 
expectation  values.  His  findings  are  consistent  with  the  application  of  the  "law  of  large 
numbers",  supporting  a  conclusion  that  Lanchester's  Equations  are  more  applicable  for  larger 
systems.  The  sensitivity  of  the  system's  evolution  to  its  initial  conditions  is  also  explored.  In 
addition  to  examining  the  equations  governing  the  evolution  of  the  expectation  values  for  the 
system's  variables,  Fowler  also  developed  equations  governing  the  evolution  of  the  variances 
and  covariances  of  those  variables.  Noting  the  lack  of  correlation  between  force  strength  and 
combat  success,  he  has  advanced  the  idea  that  each  side's  perception  of  its  chance  for  success 
is  related  to  the  behaviour  of  the  system's  variances  and  covariances. 
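
The "law of large numbers" argument above can be illustrated numerically. The sketch below is a minimal Monte Carlo simulation that treats the Square Law as a continuous-time Markov chain, in which each casualty is a discrete random event. This formulation is an assumption made for illustration and is not the Fokker-Planck treatment used by Fowler; the coefficient values, force sizes and function names are likewise hypothetical. It compares the mean and spread of the surviving fraction of one side across forces of different size.

```python
import math
import random
import statistics

def simulate(n, a, b, t_end, rng):
    """One stochastic Square Law battle with x0 = y0 = n.

    Returns the surviving fraction of side x at time t_end.
    """
    x, y, t = n, n, 0.0
    while x > 0 and y > 0:
        rate_x = a * y            # rate at which x loses one unit
        rate_y = b * x            # rate at which y loses one unit
        total = rate_x + rate_y
        t += rng.expovariate(total)   # waiting time to the next casualty
        if t > t_end:
            break
        if rng.random() < rate_x / total:
            x -= 1
        else:
            y -= 1
    return x / n

def spread_by_size(sizes, a=0.05, b=0.05, t_end=5.0, runs=200, seed=1):
    """Mean and standard deviation of the surviving fraction for each size."""
    rng = random.Random(seed)
    out = {}
    for n in sizes:
        fractions = [simulate(n, a, b, t_end, rng) for _ in range(runs)]
        out[n] = (statistics.mean(fractions), statistics.stdev(fractions))
    return out

results = spread_by_size([10, 100, 1000])
# Deterministic value of x(t)/x0 when a == b and x0 == y0.
expected = math.exp(-0.05 * 5.0)
for n, (mean, sd) in results.items():
    print(n, round(mean, 3), round(sd, 3), round(expected, 3))
```

Under these assumptions the spread of outcomes shrinks as the forces grow while the mean approaches the deterministic value, which is consistent with the conclusion that Lanchester's Equations are more applicable to larger systems.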

The aggregation of one-to-one engagements into many-on-many engagements is extensively
covered, again from first principles, starting with detection and tracking theory and considering
time and range dependent effects on the probability of killing a target. The influence of
weather, topology and weapon performance (ballistic or guided) is also included in his
treatment of the determination of aggregate attrition coefficients, along with command and
control problems in distributing targets among multiple firers.

Fowler's work is probably the most comprehensive review of combat attrition modelling
presently available at the unclassified level and is a good guide to the complexity of large
aggregate military models that have been used, such as the USAF's Thunder and its
replacements. However, the detailed scope of what is required by such a model also makes
clear why such models fail to provide understanding. It is very difficult to relate trends in the
model's outcome to causes, with the result that simpler, more empirical models such as
Lanchester's Equations retain considerable utility.

2.2  Holistic 

Phenomena such as emergence are believed to impose limits on the application of
reductionism. In linear systems, the interaction between all components is obtained from the
superposition of all possible pairwise component interactions. Nonlinearity may produce
additional effects, not predicted by the properties of individual components or their simple
interactions, in systems formed from large numbers of interacting components. Therefore at
each stage in the aggregation of components to produce objects on a higher level of
organisation, new concepts and generalisations must be added that do not arise from the
properties of its components [14].

The  holistic  approach  is  the  idea  that  all  of  the  properties  of  a  complex  system  cannot  be 
explained by summing the behaviour of its component parts alone [15]. In contrast to the
reductionist program above, aggregating a system's component parts is held to be insufficient to
provide a realistic description; additional phenomena or interactions, which cannot be
deduced from the parts, are required to produce an accurate description. This model making
paradigm requires the developers to determine whether additional constructs are required by
the model at each level of aggregation of its component parts. The reductionist viewpoint
regards these additional phenomena as empirical and lacking rigorous justification.

Dupuy's  Quantified  Judgement  Model  (QJM)  [13]  typifies  holistic  combat  models.  It  is  an 
empirical expression of Clausewitz's "Law of Numbers", in which historical analysis of combat
outcomes  is  used  to  determine  approximate  numerical  values  for  its  parameters.  It  is  an 
example  of  the  force  scoring  approaches  reviewed  by  Jaiswal  [16].  The  combat  power  of  a  side 
is  described  in  terms  of  its  theoretical  force  strength  and  parameters  describing  the  impact  of 
operational  factors.  The  force  strength  is  a  weighted  sum  of  the  lethality  (killing  power)  of  the 
elements  making  up  that  force.  The  weighting  takes  into  account  the  impact  of  weather, 
terrain  and  the  spatial  dispersion  of  the  force.  Battle  outcomes  depend  on  the  combat  power  of 
the two sides. The principal criticisms of the QJM approach are its complexity, often
contradictory formulation, reliance on military judgement to determine values for certain
parameters and its lack of a scientifically rigorous foundation [12]. However, this appears to
be little more than the reductionist view of the holistic approach.
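
The idea of force strength as a weighted sum of element lethalities can be sketched in a few lines of Python. This is a generic force scoring illustration and not Dupuy's actual QJM formulation: every lethality score and weighting factor below is a hypothetical placeholder, and collapsing weather, terrain and dispersion effects into a single weight per element type is a simplifying assumption.

```python
def force_strength(elements, lethality, environment_weight):
    """Weighted sum of element lethality scores for one side.

    elements: mapping of element type to number of elements
    lethality: mapping of element type to a lethality score
    environment_weight: mapping of element type to a combined
        weather/terrain/dispersion factor (defaults to 1.0)
    """
    return sum(
        count * lethality[kind] * environment_weight.get(kind, 1.0)
        for kind, count in elements.items()
    )

# All numbers below are hypothetical placeholders, not QJM values.
lethality = {"infantry": 1.0, "artillery": 40.0, "tank": 60.0}
weights = {"infantry": 1.0, "artillery": 0.9, "tank": 0.7}
attacker = {"infantry": 5000, "artillery": 40, "tank": 100}
defender = {"infantry": 3000, "artillery": 60, "tank": 30}

s_att = force_strength(attacker, lethality, weights)
s_def = force_strength(defender, lethality, weights)
print(s_att, s_def, s_att / s_def)
```

The sketch also makes the criticism concrete: the resulting force ratio depends directly on the judgement used to set each score and weight.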


3.  The  Role  of  Historical  Analyses 

Both  the  reductionist  and  holistic  approaches  require  data  analysis.  The  main  difference 
between  them  is  that  the  reductionist  approach  only  uses  such  analysis  to  determine  the 
properties  of  the  most  basic,  fundamental,  objects  in  its  hierarchical  system  deconstruction. 
These  properties  constitute  the  "first  principles"  from  which  all  other  behaviours  can  be 
developed.  The  data  used  at  this  level  of  analysis  is  generally  at  the  scale  of  individual 
performance  and  interactions,  which  as  a  result  does  not  include  the  effects  of  any  collective 
behaviour. 

In  addition  to  making  use  of  this  data  analysis  of  the  basic  objects,  the  holistic  approach  looks 
for  additional  properties  and  interactions  that  arise  from  collective  behaviour  at  each  level  of 
object  aggregation.  This  is  obtained  by  additional  data  analysis  of  larger  scale,  collective, 
interactions  including  up  to  entire  battles.  Such  data  cannot  in  general  be  produced  through 
controlled  experiments  or  exercises  and  must  be  obtained  from  the  historical  record.  This  use 
of  historical  data  analysis  in  the  formulation  of  combat  models  is  a  major  difference  between 
these  two  approaches. 

There  have  been  numerous  attempts  to  compare  historical  combat  data  with  the  behaviour 
expected from combat models, including the work of Helmbold [4] and Hartley [5]. Hartley
also includes a comprehensive review of the effort to validate combat attrition laws using
historical analysis. In contrast to those approaches, Hartley emphasises development of a
combat model, including attrition, directly from historical battle data. Such analysis identifies
relationships  between  many  combat  factors  including  force  size,  posture,  casualties,  surprise 
and  duration.  Mathematical  expressions  of  these  relationships  are  developed  using  standard 
regression  analysis  techniques  and  significance  tests.  Key  casualty  relationships  are  shown  to 
be  consistent  with  the  expectations  of  the  mixed  Lanchester  attrition  equations. 
To validate differential models of attrition, force and casualty numbers for both sides
intermediate to the starting and finishing values are required. That level of detail is rarely
available and often does not exist. Hartley's approach uses only initial and final values of
engaged force strengths. A number of such compilations exist, which have been aggregated
by Hartley. The component databases were created by different workers for a variety of
different purposes. The potential for bias and error in the data is carefully considered,
especially for battles prior to the 19th Century. Hartley argues that this database constitutes a
random sample, because the individual datasets comprising the database were independently
derived. While such aggregation will improve the accuracy of statistical estimators of dataset
properties, the conclusion that the database, while not a true random sample, can be
treated as if it were effectively random requires further justification. It is difficult to avoid the
conclusion that the database is little more than an aggregate of accidental sampling databases.

Lanchester's  Equations  describe  the  behaviour  of  a  single  system  in  time.  However,  the 
historical  databases  contain  information  about  an  ensemble  of  battles,  each  potentially  with 
different  values  of  attrition  rate  coefficients  a  and  b.  Should  the  results  from  such  an  ensemble 
follow  the  behaviour  expected  of  a  single  system?  Hartley  has  examined  this  issue  at  length. 
He  considered  several  hypotheses,  rejecting  all  save  the  conclusion  that  the  relationship 
between  the  data  from  an  ensemble  of  different  battles  was  a  direct  consequence  of  the 
equations  governing  the  attrition  process.  In  other  words,  the  behaviour  governing  an 
individual  battle  was  reflected  in  the  behaviour  of  an  ensemble  of  battles.  The  bulk  of 
Hartley's  work  is  concerned  with  the  development  and  examination  of  a  model  constructed 
from  this  historical  analysis. 
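
The question of whether an ensemble should follow single-battle behaviour can be explored with synthetic data. The sketch below is an illustrative assumption and not Hartley's method or data: it generates an ensemble of Square Law battles with random sizes and durations, then fits ln(x0^2 - x^2) against ln(y0^2 - y^2) by least squares regression. When every battle shares the same ratio a/b the fit is exact; when a/b varies from battle to battle, the scatter partly obscures the underlying relationship. All distributions and parameter values are hypothetical.

```python
import math
import random

def final_state(x0, y0, a, b, t):
    """Closed-form Square Law strengths at time t (valid while both are positive)."""
    k = math.sqrt(a * b)
    x = x0 * math.cosh(k * t) - y0 * math.sqrt(a / b) * math.sinh(k * t)
    y = y0 * math.cosh(k * t) - x0 * math.sqrt(b / a) * math.sinh(k * t)
    return x, y

def least_squares(u, v):
    """Slope, intercept and R^2 of an ordinary least squares fit of v on u."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    svv = sum((vi - mv) ** 2 for vi in v)
    suv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    slope = suv / suu
    return slope, mv - slope * mu, suv ** 2 / (suu * svv)

def ensemble_fit(log_ratio_sd, battles=300, seed=2):
    """Fit ln(x0^2 - x^2) against ln(y0^2 - y^2) over a synthetic ensemble."""
    rng = random.Random(seed)
    u, v = [], []
    for _ in range(battles):
        x0, y0 = rng.uniform(1e3, 1e5), rng.uniform(1e3, 1e5)
        b = 0.01
        a = b * math.exp(rng.gauss(0.0, log_ratio_sd))  # battle-specific a/b
        x, y = final_state(x0, y0, a, b, rng.uniform(0.5, 2.0))
        if x > 0 and y > 0:  # discard cases where one side is annihilated
            u.append(math.log(y0 ** 2 - y ** 2))
            v.append(math.log(x0 ** 2 - x ** 2))
    return least_squares(u, v)

same = ensemble_fit(0.0)    # every battle has a/b = 1
varied = ensemble_fit(1.0)  # ln(a/b) varies with unit standard deviation
print("common a/b :", same)
print("varied a/b :", varied)
```

In this synthetic setting a clean ensemble relationship requires the attrition coefficient ratios to be tightly constrained across battles, which is the point at issue for the historical data.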

3.1  Interpreting  the  Historical  Record 

Despite  the  large  number  of  recorded  battles  throughout  history,  the  number  with  usable  data 
is  small.  Any  compilation  of  battle  data,  being  a  subset  of  all  battles,  constitutes  a  sample.  A 
useful  database  will  have  a  sample  of  battle  data  that  is  representative  of  patterns  observed  in 
the  population  of  all  battles.  There  are  many  issues  which  must  be  considered  when 
attempting  to  use  historical  data,  including: 

•  potential  bias  in  narrative  accounts  of  the  battle  due  to  most  accounts  being  written  by 
the  victor  or  for  propaganda  purposes, 

•  many  reported  results  are  qualitative  or  approximate, 

•  many  reported  results  for  the  same  battle  disagree,  including  dispute  over  which  side 
won, 

•  when  determining  force  strengths  should  support  or  service  personnel  be  included, 

•  when  determining  casualties  should  prisoners  be  included, 

•  how  should  force  strength  be  obtained  from  numbers  of  participants,  should  some 
form  of  force  scoring  such  as  the  QJM  be  used, 

•  how  should  the  effect  of  leadership,  initiative,  surprise,  terrain  and  weather  be 
included, 

•  how is the boundary of a battle defined, should strategic airpower or naval gunfire
support be included,

•  how should the effect of reserves be included, should the availability of uncommitted
forces  be  included, 

•  should  a  battle  be  considered  as  a  single  event,  or  does  it  more  closely  resemble  a 
related  series  of  events  (phases)  separated  in  space  and  time. 

This  last  point  is  critical.  Most  battles  can  be  regarded  as  a  series  of  events  that  occur  both 
consecutively  and  concurrently.  Both  sides  may  be  the  attacker  in  different  phases  of  the  same 
battle,  leading  to  dispute  over  who  is  the  attacker.  Each  phase  should  strictly  be  considered  as 
a separate battle, as originally intended by Lanchester [1]. However, under many conditions
they  can  be  aggregated  into  a  single  battle.  How  a  large  battle  is  segmented  into  smaller 
actions  can  substantially  affect  its  analysis. 

Most  authors  using  historical  data  have  attempted  to  address  some  of  these  issues  [4,  5], 
especially questions of how to determine force strength and casualties. However, there
remains  a  question  regarding  the  accuracy  of  much  of  the  original  reporting,  especially  for 
battles  prior  to  the  19th  Century.  Because  different  methods  and  standards  may  have  been 
used  in  recording  each  battle,  the  data  will  always  contain  inconsistencies  and  be  subjective  to 
some  extent.  But  this  is  also  true  of  current  military  activities  and  must  be  accepted  as 
representing  a  limit  on  the  accuracy  of  any  analysis  using  real  world  data.  These  difficulties 
have  left  many  in  the  field  doubtful  over  the  utility  of  historical  analysis  [12],  as  no  universal 
solution  exists  to  these  problems.  Accepting  that  such  problems  cannot  be  entirely  eliminated, 
an  approach  to  mitigate  their  impact  will  be  considered  in  a  following  section. 

3.2  Attrition  Model  Validation 

As already mentioned, to validate differential models of attrition such as Lanchester's
Equations,  force  and  casualty  numbers  for  both  sides  at  times  intermediate  to  the  starting  and 
finishing  times  are  required.  That  level  of  detail  is  rarely  available  and  often  does  not  exist. 
The  author  is  aware  of  four  studies:  Engel's  [17]  pioneering  work  on  the  Iwo  Jima  campaign, 
Busse's  work  on  the  Inchon  campaign  [18],  Bracken's  study  of  the  Ardennes  campaign  [19] 
and  Lucas's  examination  of  the  battle  of  Kursk  [20].  As  noted  previously,  Lanchester's 
Equations  are  not  a  model  of  combat,  only  a  model  for  combat  attrition.  Each  of  these  studies 
had  first  to  segment  the  data,  using  narrative  accounts  of  the  battle,  and  extract  those  changes 
that  were  the  result  of  attrition  from  all  other  changes.  How  a  large  battle  is  segmented  into 
smaller  actions  can  substantially  affect  its  analysis.  Each  analysis  made  decisions  regarding 
the  inclusion  of  non-combat  personnel,  made  more  difficult  by  the  combat  effect  of  external 
participants  such  as  US  Naval  gunfire  at  Iwo  Jima  and  Inchon.  The  studies  differed  on  the 
consistency  of  the  historical  data  with  the  expectation  from  Lanchester's  Equations.  The  Iwo 
Jima  analysis  found  broad  consistency,  while  the  Kursk  analysis  found  that  the  segmentation 
into phases was more important in explaining the observed casualty patterns. Much of the
problem is due to the lack of sufficient intermediate force and casualty values for both
sides.

The four studies above each contained from 20 to 40 time-correlated records of force strength
and casualties for both sides. This sequence of values had to be segmented into smaller
sequences due to the application of external variables such as the arrival of reinforcements.
This resulted in sequences of related events available for analysis generally being much
smaller, and only rarely consisting of as many as 20 events. Determination of the form of the
controlling  attrition  relationship  is  then  made  using  regression  analysis.  However,  studies  of 
the  relationship  between  sample  size  and  precision  in  such  analyses  [21]  show  that  10  times  as 
many observations are required as there are parameters in the regression model to obtain 90%
confidence in the prediction of their values. Lanchester's Equations have 2 such parameters.
The  uncertainty  arising  from  the  use  of  the  short  data  sequences  available  in  the  historical 
record  is  a  major  factor  in  the  poor  discrimination  between  competing  attrition  models 
reported  in  these  studies. 
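
The rule of thumb quoted above can be made concrete with a short calculation. The following is a minimal sketch only, not taken from the cited studies; the function names and the example segment lengths are illustrative assumptions.

```python
def required_observations(n_parameters: int, per_parameter: int = 10) -> int:
    """Observations suggested by the 10-per-parameter rule of thumb quoted in the text."""
    return per_parameter * n_parameters


def is_adequate(sequence_length: int, n_parameters: int = 2) -> bool:
    """True if a data sequence meets the rule of thumb for a model of the given size."""
    return sequence_length >= required_observations(n_parameters)


if __name__ == "__main__":
    # Lanchester's Equations have 2 attrition parameters.
    print("observations needed:", required_observations(2))
    # Hypothetical segment lengths left after splitting a longer record.
    for length in (8, 12, 20):
        print(length, "adequate" if is_adequate(length) else "too short")
```

Under this rule a two-parameter model needs about 20 observations, which is the length that the segmented historical sequences only rarely reach.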

3.3  Analysis  for  Ensembles  of  Battles 

Lanchester's  attrition  equations  describe  the  behaviour  of  a  single  system  in  time.  However,  as 
can  be  seen  from  Equation  2,  the  initial  and  final  states  of  a  battle  are  related.  Hence  a 
database  of  such  information  from  a  collection  of  historical  battles  does  contain  information 
about  the  attrition  processes  that  govern  their  evolution.  But,  each  battle  in  such  compilations 
potentially  has  different  values  of  the  attrition  rate  coefficients  a  and  b.  Consequently, 
examination  of  the  dependence  of  the  left  hand  side  of  Equation  1  should  not  yield  any 
interesting  functional  dependence,  as  the  attrition  ratio  is  independent  of  initial  or  final  state 
values. This is not what is observed when the data is examined, as originally reported by
Helmbold [4]. Hartley's data compilation [5] is shown in Figure 1, plotting the natural
logarithm (ln) of the left hand side of Equation 2 (known as the Helmbold Ratio) against the ln
of the initial force ratio.


The  colour  of  each  data  point  indicates  the  identity  of  the  winning  side.  A  more  detailed 
examination  of  Hartley's  data  analysis  follows  later.  While  the  data  exhibits  considerable 
scatter,  a  clear  trend  is  apparent. 
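
The link between the initial and final states of a single battle can be illustrated numerically. The sketch below assumes the usual Square Law form, dx/dt = -a*y and dy/dt = -b*x, and assumes that the Helmbold Ratio is (x0^2 - x^2)/(y0^2 - y^2). Both conventions, including which coefficient belongs to which side, are assumptions made for illustration and may differ from Equations 1 and 2 of this report. The force sizes and coefficients are hypothetical.

```python
import math


def square_law_state(x0, y0, a, b, t):
    """Closed-form solution of dx/dt = -a*y, dy/dt = -b*x at time t."""
    k = math.sqrt(a * b)
    c, s = math.cosh(k * t), math.sinh(k * t)
    x = x0 * c - y0 * math.sqrt(a / b) * s
    y = y0 * c - x0 * math.sqrt(b / a) * s
    return x, y


def helmbold_ratio(x0, x, y0, y):
    """(x0^2 - x^2) / (y0^2 - y^2), the form assumed here for the Helmbold Ratio."""
    return (x0**2 - x**2) / (y0**2 - y**2)


if __name__ == "__main__":
    x0, y0, a, b = 1000.0, 800.0, 0.02, 0.01  # hypothetical values
    for t in (2.0, 5.0, 10.0):
        x, y = square_law_state(x0, y0, a, b, t)
        # Under these assumptions the ratio stays at a/b throughout the battle.
        print(t, round(x, 1), round(y, 1), helmbold_ratio(x0, x, y0, y))
```

With constant coefficients the ratio depends only on a/b and not on the initial force ratio, which is why a trend with force ratio across many battles calls for further explanation.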


An  early  explanation  for  this  behaviour  was  found  if  the  attrition  coefficients  are  not  constants 
but  depend  on  the  force  ratio.  Such  behaviour  can  be  explained  in  terms  of  battlefield 
congestion  preventing  a  side  from  making  full  use  of  its  available  forces  and  thus  reducing  the 
effective  attrition  rate  against  its  opponent.  Lanchester's  Equations,  modified  to  include  the 
effect of "diminished marginal returns", are given in Equation 3:

Figure 1: Helmbold's Relationship, from Hartley [5]. Legend: victor (attacker or defender); fitted line a = 1.35.

If  the  attrition  functions  f(  )  and  g(  )  are  simple  power  laws,  as  shown  in  the  right  hand 
expressions  of  Equation  3,  the  resulting  equation  of  state  (also  known  as  the  Helmbold 
Equation) [4] is shown in Equation 4. This equation is consistent with the equation of state in
Equation 2, subject to the additional assumption above that "diminished marginal returns"
constrain the values that the attrition coefficients a and b can take. This equation can also be
regarded as a statement of that constraint.

Bearing in mind that Equation 4 describes the evolution of a single battle, the similar
behaviour shown by the data describing the ensemble of battles (Figure 1) was considered
remarkable [5].

Previous work by the author [6] has shown that this equation of state also results from
Lanchester's original equations, when the spatial distribution of each side's forces is modelled
stochastically where the probability is fractally distributed (power law). Both derivations are
based on the same principle: a side's spatial distribution limits its ability to target enemy
forces, thus reducing its effective attrition coefficient, which then depends on the force ratio.

Helmbold's  pioneering  work  on  historical  battle  analysis  [4]  made  the  assumption  that  the 
attrition  coefficients  were  approximately  the  same  for  all  battles.  Hartley  [5]  sought  to  relax 
this assumption and has examined this issue at length. He considered several hypotheses,
rejecting all save the conclusion, albeit more empirical than rigorous, that the relationship
between  the  data  from  an  ensemble  of  different  battles  was  a  direct  consequence  of  the  form 
of  the  equations  governing  the  attrition  process  of  a  single  battle.  In  other  words,  the 
behaviour  governing  an  individual  battle  was  reflected  in  the  behaviour  of  an  ensemble  of 
battles.  Indeed,  Helmbold's  original  work  on  the  validation  of  Lanchester's  Equations  using 
historical  data  found  this  applies  to  a  number  of  different  parameters  including  the  defender's 
advantage. 

Further  consideration  as  to  why  this  similarity  of  behaviour  exists  will  be  deferred  until 
Section  8.  The  present  work  will  accept  that  the  observed  behaviour  of  such  an  ensemble  of 
battles  will  follow  the  behaviour  expected  of  that  parameter  during  the  course  of  a  single 
battle. 

3.4  Issues  in  Database  Development 

The  fidelity  of  historical  analysis  is  dependent  on  choice  of  an  appropriate  data  sample.  If  the 
sample  is  representative  of  the  population,  parameter  estimators  derived  from  the  sample  will 
also  be  representative  of  the  value  of  that  parameter  in  the  population,  within  error  limits 
determined by sample selection and size [22]. How the sample is obtained has considerable
influence  on  how  representative  of  the  population  it  is,  with  random  sampling  techniques 
considered  the  least  influenced  by  bias  and  error. 

A  sample  of  objects  from  a  population  is  random  if  all  the  members  of  the  population  have  an 
equal  chance  of  appearing  in  the  sample.  This  applies  to  all  members  of  the  population, 
exceptional  as  well  as  typical  members.  Otherwise  a  correlation  between  the  quantity  being 
measured  and  probability  of  appearing  in  the  sample  can  result  in  the  value  of  the 
parameter's  estimator  being  very  different  from  the  value  of  that  parameter  in  the  population. 

When  sampling  a  heterogeneous  population  the  precision  achieved  can  be  increased  and  the 
risk  of  bias  reduced  by  dividing  the  population  into  sections,  each  relatively  homogeneous, 
and  sampling  each  section  (or  stratum)  separately.  Estimates  obtained  for  each  stratum  can 
then  be  combined  to  give  the  estimate  for  the  whole  population.  If  entire  groups  of  a 
heterogeneous  population  are  excluded  from  a  sample,  there  are  no  adjustments  that  can 
produce  representative  estimates  of  the  entire  population.  However,  if  some  groups  are 
under-represented  and  the  degree  of  under  representation  can  be  quantified,  then  sample 
weights  can  compensate  for  the  bias. 
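
The effect of combining stratum estimates with weights can be sketched as follows. This is an illustrative example only: the two strata, their population shares and the sampled values are hypothetical and are not drawn from any battle database.

```python
def stratified_mean(strata):
    """Combine per-stratum sample means using each stratum's population share.

    strata: list of (population_share, sample_values) pairs; shares must sum to 1.
    """
    total_share = sum(share for share, _ in strata)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(share * (sum(values) / len(values)) for share, values in strata)


def naive_mean(strata):
    """Unweighted mean of all sampled values, ignoring stratum membership."""
    pooled = [v for _, values in strata for v in values]
    return sum(pooled) / len(pooled)


if __name__ == "__main__":
    # Hypothetical: small battles are 80% of the population but only 2 of 6 samples.
    strata = [
        (0.8, [0.1, 0.2]),            # small battles (under-represented)
        (0.2, [0.5, 0.6, 0.7, 0.6]),  # large battles (over-represented)
    ]
    print("naive estimate:     ", naive_mean(strata))
    print("stratified estimate:", stratified_mean(strata))
```

The weighting only works because the population shares are taken as known. If a stratum is missing from the sample altogether, no choice of weights recovers it.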

When  the  population  being  sampled  is  extensive  or  complex,  the  practical  problems  in  taking 
a  simple  random  sample  are  great,  and  the  time  taken  for  even  a  small  sample  may  be  large. 
The difficulty in obtaining a sample of a given size may be greatly reduced by carrying out the
sampling  in  two  stages.  First  the  complete  population  may  be  divided  into  a  number  of 
distinct  primary  units  or  sub-populations,  and  from  these  a  sample  is  taken.  From  each  of 
these  sampled  sub-populations  a  secondary  sample,  or  sub-sample  is  taken. 

The  least  useful  and  most  subject  to  bias  of  all  sampling  procedures,  accidental  sampling, 
involves  using  what  is  available  and  most  convenient  as  a  sample  pool. 

Some  of  the  difficulties  in  compiling  historical  battle  samples  have  already  been  discussed 
above.  A  number  of  such  data  compilations  were  aggregated  and  used  by  Hartley.  The 
component  databases  were  put  together  by  different  workers  for  a  variety  of  different 
purposes.  The  aggregate  database  covers  a  wide  range  of  force  ratios  and  while  emphasising 
20th  Century  battles,  has  reasonable  coverage  back  to  1600.  It  emphasises  land  battles,  but 
includes  one  air  campaign.  Hartley  argued  that  the  individual  datasets  comprising  the 
database  produce  a  random  sample  upon  aggregation,  because  they  were  independently 
derived.  This  argument  is  similar  to  the  inverse  of  Bootstrap  sampling  [23],  which  has  been 
used  to  improve  the  accuracy  of  measures  of  sample  statistical  descriptors.  Aggregation  will 
improve  the  accuracy  of  statistical  estimators,  but  does  not  affect  bias.  It  is  difficult  to  avoid 
the  conclusion  that  Hartley's  database  is  little  more  than  an  aggregate  of  accidental  sample 
databases. 

All  battle  databases  are  the  product  of  the  recursive  application  of  the  sub-sampling  process. 
The  population  consists  of  all  battles.  This  is  first  sampled  to  produce  the  set  of  all  recorded 
battles.  Many,  especially  smaller  engagements,  are  never  recorded.  The  requirement  that  both 
the initial and final values of force strengths are known produces another sub-sampling stage
to  generate  the  set  of  all  recorded  battles  with  usable  data.  This  sampling  process  also 
discriminates  against  smaller  battles.  Larger  battles  receive  more  attention  and  hence  are  more 
likely  to  have  their  attributes  recorded.  All  battle  databases  are  themselves  samples  of  that 
sample.  Even  if  the  final  sampling  process  was  random,  the  process  of  recording  history 
generates  an  intrinsic  bias  towards  larger  battles.  This  bias  cannot  be  eliminated  and  any 
analysis  technique  must  include  procedures  for  identifying  and  dealing  with  that  bias. 
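
The intrinsic bias towards larger battles can be illustrated with a toy simulation. The sketch below is not a model of how history is actually recorded. It assumes, purely for illustration, a lognormal distribution of battle sizes and a probability of being recorded that rises linearly with size.

```python
import random


def simulate_recording(n_battles=10_000, seed=1):
    """Return (population_mean_size, recorded_mean_size) for a toy population."""
    rng = random.Random(seed)
    # Assumed size distribution: lognormal, chosen only for illustration.
    sizes = [rng.lognormvariate(7.0, 1.0) for _ in range(n_battles)]
    largest = max(sizes)
    # Assumed recording rule: chance of being recorded grows linearly with size.
    recorded = [s for s in sizes if rng.random() < s / largest]
    return sum(sizes) / len(sizes), sum(recorded) / len(recorded)


if __name__ == "__main__":
    population_mean, recorded_mean = simulate_recording()
    print("mean size, all battles:     ", round(population_mean))
    print("mean size, recorded battles:", round(recorded_mean))
```

Any estimator that correlates with battle size inherits this shift, even if every later sub-sampling stage is random.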

Hartley  looked  for  the  effects  of  bias  in  his  data  compilation  by  dividing  it  into  a  series  of 
partitions using criteria such as date, size, attacker/defender identity as well as campaign and
original  data  source.  Given  that  each  of  Hartley's  data  sources  included  a  different  spread  of 
such  values,  each  partition  produced  a  different  sub-sample  of  battles.  Trends  that  were 
observed  in  the  value  of  a  sample  estimator  in  a  partition  would  indicate  the  presence  of  a 
correlation  between  the  value  of  the  parameter  defining  that  partition  and  the  probability  of 
being  included  in  the  sample.  That  estimator  would  then  be  subject  to  bias.  Hartley's  analysis 
was  primarily  concerned  with  the  behaviour  of  Equation  4  between  the  data  partitions.  He  did 
not  observe  any  significant  differences,  all  observations  being  comparable  to  the  estimated 
error, leading to the conclusion that the bias would not have a measurable effect on his
conclusions.

However, bias was observed by the author on examining the number of battles of particular
sizes in the database [7]. The way in which history is recorded, as already described, leads to
an inherent bias in such compilations, and each analysis technique must include procedures for
identifying and dealing with that bias. Given this requirement, it is more important to
establish consistency in the analysis of different data compilations. Consistency in the results
from analysis of different databases, established using different methodologies and different
primary sources, indicates that bias beyond the inherent historical bias described above has
little effect.

3.5  Historical  Database  Instantiation 

Both  Helmbold  and  Hartley  sourced  their  data  primarily  from  research  undertaken  in  the  first 
half  of  the  Twentieth  Century.  Both  compilations  recorded  significantly  more  information 
than just initial and final strengths. Restricting the data compilation undertaken for the
present work to just the initial and final strengths increases the number of battles for which
suitable data has been recorded. Moreover, in the last quarter of the Twentieth Century, a
significant amount of new research has become readily available. A number of factors have
led to this increase. Changed conditions in Europe have enabled researchers to access
sources previously little explored, especially in Eastern Europe. Popular interest in military
history has led to more detailed scrutiny of archives, enabling earlier work to be reviewed and
long forgotten sources to be rediscovered. It is important, however, to guard against
revisionist tendencies among historians with a particular agenda to pursue².

The  present  work  has  developed  a  compilation  of  historical  battle  data,  building  on  Hartley's 
compilation,  using  these  advantages.  This  has  enabled  the  number  of  battles  included  to 
increase  from  around  750  to  around  1600.  Each  battle  was  checked  against  the  most  recent 
available  data  and  earlier  inaccuracies  have  been  corrected.  Previously,  where  some  data  was 
disputed,  the  battle  had  been  included  in  the  database  multiple  times  with  each  entry 
corresponding  to  a  different  interpretation,  such  as  when  the  winning  side  was  disputed. 
More  recent  research  has  enabled  most  of  these  discrepancies  to  be  resolved,  and  each  battle 
now  corresponds  to  a  single  entry.  Where  sources  disagreed,  the  consensus  opinion  or  values 
were  followed.  The  internet  has  provided  a  means  to  facilitate  large  scale  collaborative 
research  on  a  scale  not  previously  possible.  Comprehensive  archives,  especially  for  subjects  of 
popular  interest  such  as  the  American  Civil  War  [24]  and  Napoleonic  Wars  [25],  have  been 
produced  by  collaboration  between  enthusiasts  and  professionals  using  primary  and 
secondary  sources.  Each  entry  in  the  database  developed  in  the  present  work  results  from 
many sources and opinions³. This process is sometimes known as the "Wisdom of Crowds"
[26]  and  is  analogous  to  the  process  of  Bootstrap  sampling  [23],  with  its  resultant 
improvement  in  accuracy. 

The  availability  of  more  sources,  both  primary  and  secondary,  from  both  combatant  and  third 
party  observers  has  enabled  wider  views  on  the  progress  and  outcome  of  battles  to  be  heard. 
While not preventing bias, which still occurs, the availability of alternate opinions mitigates
much possible bias in recent battle analyses.

The presence of service and support troops plays an important part in the capacity of combat
troops to engage in and sustain combat. Their contribution is sufficient to justify their
inclusion in battle strengths, where listed separately. While this has been done for service
troops in direct support of a particular battle, service troops in a more general support role,
possibly supporting several battles, have not been included; in general their numbers are
not known reliably.

² Such as those seeking to restore Hans Delbrück's agenda or rehabilitate the reputation of Gen. D. Haig.

³ For an example of the detail now available for an increasing number of battles see: "1805: Austerlitz" by R.
Goetz, Greenhill Books, London, 2005.

Where  separately  reported,  prisoner  numbers  have  not  been  included  in  casualty 
determination.  When  small,  this  represents  at  most  a  small  error  in  the  combat  impact  of 
losses.  When  large,  the  prisoners  generally  resulted  from  actions  undertaken  after  the 
cessation of major combat and did not influence the outcome. This also applies to other
non-combat casualties and is the reason for the exclusion of most sieges from consideration. The
exception  here  is  when  the  siege  ended  as  the  result  of  a  single  assault. 

The  most  appropriate  representation  of  a  force's  combat  strength  is  to  record  the  strength  of 
each  type  of  combat  participant  and  develop  separate  attrition  expressions  for  their 
interactions. Heterogeneous attrition models involve many interactions and the resulting
combinatorial "explosion" greatly increases their complexity, which in turn reduces their utility
and comprehensibility. Furthermore, historical data rarely includes detailed force compositions,
leading to considerable uncertainty in estimated values. A common way to reduce the
attrition model's complexity is to construct a homogeneous force strength determined using
some form of force scoring methodology such as QJM. However, the effects of uncertainty in
the composition of the force still remain. Indeed, given that a comprehensive historical
database  must  include  the  effects  of  a  wide  range  of  weapons  with  considerable  differences  in 
lethality  (comparing  a  spear  with  a  modern  Main  Battle  Tank  for  example),  the  likelihood 
exists  that  the  resulting  relative  force  strengths  may  be  little  more  than  an  artefact  of  the  force 
scoring  methodology.  It  is  not  clear  that  such  methods,  at  least  for  the  purposes  of  the  present 
work,  are  any  more  reliable  than  a  simple  comparison  of  the  number  of  participants,  which 
was  the  method  chosen  for  this  database  development. 

The  decision  as  to  what  comprises  a  single  battle,  its  boundary  in  space  and  time,  is  a  decision 
that  must  be  taken  separately  for  each  battle  after  considering  the  battle  narrative.  Each  battle 
was  selected  for  inclusion  in  an  attempt  to  only  consider  battles  that  were  thought  to 
constitute  a  single  engagement  in  terms  of  Lanchester's  original  conception.  The  timing  and 
availability  of  reserves  also  affected  this  decision  as  well  as  the  force  size. 

As  mentioned  previously,  the  bias  introduced  by  the  process  of  recording  history  is  intrinsic 
and  must  be  allowed  for  in  subsequent  analysis.  If  no  suitable  reason  for  choosing  one  source 
over  another  existed,  the  author  made  a  judgement  call  on  which  version  would  be  used. 
While  this  is  unlikely  to  be  always  correct,  it  is  at  least  self-consistent  in  the  presentation  of 
data. 

The  data  recorded  for  each  battle  included  its  identifying  name  and  year,  as  well  as  a  generic 
identifier describing the conflict and technology/tactics employed to facilitate segmentation of
the  database  into  groups  of  roughly  similar  battles.  For  both  sides  of  the  battle  the  initial  and 
final  strengths  are  recorded  as  well  as  that  side's  principal  posture  (attacker,  defender)  and  its 
final status (winner, loser). A summary appears in Table 1.

Table 1: Dataset Segmentation and Summary

Data-segment    Start    End     Number of    Attacker     Defender
Epoch           Year     Year    Battles      Victories    Victories

Ancient         -490     1598     63           36           27
17th Century    1600     1692     93           67           26
18th Century    1700     1798    147          100           47
Revolution      1792     1800    238          168           70
Empire          1805     1815    327          203          124
ACW             1861     1865    143           75           68
19th Century    1803     1905    126           81           45
WWI             1914     1918    129           83           46
WWII            1920     1945    233          165           68
Korea           1950     1950     20           20            0
Post WWII       1950     2008    118           86           32

4.  Comparison  with  Previous  Work 

The  population  of  all  battles  throughout  history  cannot  be  documented.  Recorded  history  is 
only  a  sample  of  those  events  that  took  place,  and  as  already  described,  that  sample  is 
fundamentally  biased  and  accidental  in  nature.  Random  sub-sampling  of  those  recorded 
events  does  not  change  this  basic  property  of  the  resulting  sample.  However,  if  different 
methodologies  for  sub-sampling  produce  consistent  results  then  the  observed  patterns  of 
behaviour  can  be  considered  as  indicative  of  behaviour  in  the  source  sample  and  not  artefacts 
of  the  sub-sampling  process.  In  particular,  if  analysis  of  the  data  gives  the  same  results,  both 
before and after the effects of bias in the data have been addressed, the result can be considered
as  insensitive  to  the  effect  of  bias  and  indicative  of  actual  behaviour  in  recorded  history. 
Comparison  of  the  behaviour  of  the  database  developed  in  the  present  work  with  Hartley's 
database  provides  this  consistency  check. 

Not  all  of  Hartley's  approaches  to  segmenting  his  database  have  been  examined  here. 
Attacker/Defender pairs were not examined as they contain too few data points to provide
worthwhile  analysis.  The  effect  of  outliers  was  not  considered.  This  study  examines  the 
distributions  of  many  results  around  their  mean  values.  Whether  a  datum  is  an  outlier  or  an 
instance  of  an  extreme  (low  probability)  event  will  have  a  strong  effect  on  the  tail  of  the 
resulting  distribution.  Bias  could  be  introduced  by  arbitrarily  removing  data  points  from 
consideration  based  on  perceived  differences  in  behaviour,  when  such  data  points  may  be 
useful  in  illustrating  dependencies. 

Equation 4 describes the key relationship to be used for comparison of Hartley's work, Figure
1, with the database developed in the present work. The analogous results for the current
database are shown in Figure 2.

Figure 2: Helmbold's Relationship using the current database. Attacker victories are coloured red
while defender victories are green.

Least  squares  regression  generated  the  best  fit  lines  shown.  The  dashed  line  represents  the  full 
data  set  with  solid  lines  representing  the  data  segmented  by  the  victor's  posture.  Equations 
describing  the  best  fit  lines  for  attacker  victories  and  defender  victories  are  also  shown. 
Attacker  victories  are  common  for  data  both  above  and  below  the  overall  best  fit  line  (dashed) 
describing  the  average  battle  outcome,  while  defender  victories  occur  predominantly  above 
that  line.  Attackers  initially  hold  the  initiative  in  battle,  in  that  they  can  choose  whether  to 
attack  or  not.  If  their  assessment  of  the  likelihood  of  success  is  not  favourable  they  will 
generally  choose  not  to  attack,  which  may  explain  the  difference  between  attacker  and 
defender  successes  shown  in  Figure  2. 


The  gradient  a  in  Equation  4  is  the  principal  parameter  characterising  the  behaviour  of  the 
datasets.  It  determines  how  sensitive  a  battle's  outcome  (specified  by  the  Helmbold  Ratio)  is 
to  changes  in  the  initial  force  ratio.  If  a  has  a  dependence  on  force  size  (not  ratio),  technology 
(spears versus machine guns), winner's posture (attacker/defender) or another significant
discriminating  factor  between  battles,  then  analysis  of  this  dataset  would  be  subject  to  bias.  If 
a  does  not  exhibit  such  behaviour,  the  results  drawn  from  such  analysis  can  be  considered  as 
insensitive  to  such  effects. 


The  regression  parameters  obtained  for  the  dataset  as  a  whole,  and  for  the  dataset  segmented 
according to the victor's posture as well as the battle's epoch, are given in Table 2 below. The
value for a, the standard error in a, the regression coefficient of determination and the
maximum  and  minimum  values  for  a  using  a  95%  confidence  interval  are  given  for  each 
dataset.  Values  for  Hartley's  database  are  also  given  for  comparison.  The  Korea  epoch 
contained  too  few  entries  to  undertake  this  analysis  and  was  not  considered. 
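
The quantities reported for each dataset can be obtained from an ordinary least squares fit of the ln of the Helmbold Ratio against the ln of the initial force ratio. The sketch below is a minimal illustration and not the procedure used in the present work. The Helmbold Ratio form (x0^2 - x^2)/(y0^2 - y^2), the battle tuples and the normal approximation (z = 1.96) used for the 95% interval are all assumptions, and the report does not state how its interval was computed.

```python
import math


def log_ratios(battles):
    """ln(initial force ratio) and ln(Helmbold Ratio) for (x0, x, y0, y) tuples."""
    ln_fr = [math.log(x0 / y0) for x0, x, y0, y in battles]
    ln_hr = [math.log((x0**2 - x**2) / (y0**2 - y**2)) for x0, x, y0, y in battles]
    return ln_fr, ln_hr


def fit_line(x, y):
    """Ordinary least squares fit y = a*x + c.

    Returns (a, c, err, r2): slope, intercept, standard error of the slope
    and coefficient of determination.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    a = sxy / sxx
    c = my - a * mx
    sse = sum((yi - (a * xi + c)) ** 2 for xi, yi in zip(x, y))
    err = math.sqrt(sse / (n - 2) / sxx)
    r2 = 1.0 - sse / syy
    return a, c, err, r2


def slope_interval(a, err, z=1.96):
    """Approximate 95% interval for the slope (normal approximation, an assumption)."""
    return a - z * err, a + z * err


if __name__ == "__main__":
    # Hypothetical battles: (x0, x_final, y0, y_final). Not historical data.
    battles = [
        (1000, 900, 2000, 1850),
        (5000, 4200, 3000, 2500),
        (800, 700, 800, 650),
        (12000, 10500, 6000, 5200),
        (3000, 2800, 4500, 4300),
        (20000, 17000, 15000, 12000),
    ]
    ln_fr, ln_hr = log_ratios(battles)
    a, c, err, r2 = fit_line(ln_fr, ln_hr)
    print("a =", a, "err =", err, "R2 =", r2, "interval =", slope_interval(a, err))
```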


Table 2: Dataset Segmentation Regression Parameters

Dataset                 a       err     R²      min a   max a

Unsegmented
Hartley's Database      1.38    0.06    0.41    1.26    1.50
Current Database        1.44    0.04    0.43    1.36    1.52

Segmented by Posture
Attacker Victories      1.55    0.04    0.58    1.46    1.62
Defender Victories      1.42    0.06    0.51    1.30    1.53

Segmented by Epoch
Ancient                 2.48    0.34    0.49    1.80    3.16
17th Century            1.62    0.34    0.20    0.94    2.30
18th Century            1.47    0.12    0.50    1.23    1.72
Revolution              1.30    0.11    0.36    1.07    1.53
Empire                  1.27    0.10    0.32    1.07    1.47
ACW                     1.02    0.15    0.25    0.72    1.31
19th Century            1.82    0.12    0.66    1.59    2.06
WWI                     1.22    0.11    0.48    1.00    1.44
WWII                    1.20    0.11    0.35    0.99    1.42
Post WWII               1.63    0.10    0.67    1.41    1.84

The  current  database  values  for  maximum  and  minimum  a,  as  well  as  those  segmented  by 
posture,  define  an  overlap  in  the  region  1.46  to  1.52.  This  common  region  is  also  consistent 
with  Hartley's  results.  More  variation  in  the  value  of  a  is  observed  between  each  epoch.  If,  for 
the  moment,  the  values  from  the  Ancient  and  American  Civil  War  (ACW)  epochs  are  ignored 
as outliers, examination of the values for maximum and minimum a again shows a good
overlap,  although  not  as  good  as  when  segmented  by  posture. 
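
The overlap quoted for the unsegmented and posture-segmented datasets can be checked directly from the minimum and maximum values of a in Table 2. The following is a small sketch of that check; the function name is an illustrative assumption.

```python
def common_overlap(intervals):
    """Intersection of closed intervals, or None if they share no common region."""
    low = max(lo for lo, _ in intervals)
    high = min(hi for _, hi in intervals)
    return (low, high) if low <= high else None


# (min a, max a) pairs copied from Table 2.
current = (1.36, 1.52)
attacker = (1.46, 1.62)
defender = (1.30, 1.53)
hartley = (1.26, 1.50)

if __name__ == "__main__":
    print(common_overlap([current, attacker, defender]))           # (1.46, 1.52)
    print(common_overlap([current, attacker, defender, hartley]))  # (1.46, 1.5)
```

The first result reproduces the region 1.46 to 1.52 quoted above, and the second shows that Hartley's interval still shares part of that region.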

The  agreement  in  the  values  of  a  observed  in  Table  2  is  as  good  as  that  found  by  Hartley. 
Within  a  95%  confidence  interval  the  possibility  that  a  single  value  for  a  characterises  each  of 
these  datasets  cannot  be  discounted.  The  observed  values  of  a  may  then  be  regarded  as 
indicative  of  its  value  in  the  overall  population  and  not  an  artefact  of  the  sampling  process. 

The  low  values  for  the  coefficient  of  determination  are  also  significant.  One  standard 
interpretation of this value is the fraction of the observed variation in the natural logarithm
(ln) of the Helmbold Ratio that can be explained by the variation in the value of the ln of the
Force Ratio. The small values indicate that other factors are responsible for most of the
observed  variation.  A  possible  interpretation  of  this  variation  will  be  explored  in  a  later 
section. 

Explanation of the value for a observed for the Ancient epoch requires closer examination.
The value of the Helmbold Ratio is shown plotted against its corresponding Force Ratio in
Figure 3, segmented by the victor's posture. Least squares regression generated the best fit
lines shown. The dashed line represents the full epoch data with solid lines representing the
data segmented by the victor's posture. The observed value of a for attacker victories is
similar to that observed for defender victories. Both values are significantly lower
than the 2.48 reported for the epoch as a whole. Against expectation, in this dataset attacker
victories  are  more  common  for  lower  Force  Ratio  values  while  defender  victories  are  more 
common  at  higher  Force  Ratio  values.  Regression  of  the  dataset  as  a  whole  has  correlated 
these  low  Force  Ratio  attacker  wins  with  the  high  Force  Ratio  defender  wins,  resulting  in  the 
large  observed  value  of  a.  This  effect  occurs  to  some  degree  in  all  epoch  data  segments,  but  is 
more  pronounced  here  due  to  the  large  vertical  separation  between  the  attacker  and  defender 
sub-sets.  A  similar  conclusion  can  be  drawn  for  the  ACW  epoch. 


Figure  3:  Helmbold's  Relationship  for  Ancient  Epoch  battles.  Attacker  victories  are  coloured  red 
while  defender  victories  are  green. 

In  most  of  the  analyses  in  the  present  work,  attacker  and  defender  values  will  be  separated  to 
prevent  this  form  of  aliasing  biasing  the  results. 

It  is  important  to  determine  whether  any  systematic  trends  in  the  value  of  a  exist.  A  trend 
over  time  (and  hence  possibly  of  technology)  can  be  examined  using  the  data  from  Table  1 
above. To enable comparison with Hartley's examination of this trend, the data was not
segmented  by  the  winner's  posture.  Ignoring  the  Ancient  epoch  (outlier),  a  year  representative 
of  each  epoch  was  found  by  determining  the  average  year  for  all  battles  constituting  the 
epoch.  The  value  of  observed  a  plotted  against  its  representative  year  is  shown  in  Figure  4. 


Figure  4:  The  value  of  the  gradient  a  against  representative  year.  Current  database  values  are 
coloured  orange  and  Hartley's  values  are  coloured  blue. 

It  can  be  immediately  noted  that  Hartley's  values  are  consistent  with  the  data  from  Table  2. 
The small values for the coefficient of determination indicate that the systematic changes in a
over  time  are  not  significantly  different  from  zero.  More  importantly,  this  means  that  changes 
related  to  time  (such  as  technology)  do  not  have  a  significant  impact  on  the  outcome  of  battles. 

A  trend  in  a  with  battle  size  would  also  be  significant.  It  is  a  little  more  difficult  to  quantify  as 
size  can  be  determined  in  a  number  of  ways.  Most  workers  define  the  size  of  a  battle  as  the 
total  of  all  forces  involved  in  the  battle  (both  sides).  This  can  be  misleading  as  Helmbold  [4] 
reported  that  the  attacker's  strength  is  not  strongly  correlated  with  the  defender's  strength. 
For  small  and  mid-sized  battles  (up  to  40000  per  side),  the  correlation  between  the  strengths  of 
the  two  sides  is  poor,  as  can  be  seen  in  Figure  5.  Using  the  total  strength  as  a  measure  of  size 
can then mask trends that depend on a side's strength.

Figure 5: The Defender's initial strength as a function of the Attacker's initial strength. Attacker
victories  are  coloured  red  while  defender  victories  are  green. 

Two  different  means  of  quantifying  battle  size  were  examined:  the  size  of  the  attacking  force 
and  the  size  of  the  defending  force.  Dividing  the  1083  battles  for  which  the  attacker  was 
victorious  into  quartiles  using  these  factors  enables  a  comparison  of  the  value  of  a, 
determined  from  regression  analysis  in  each  quartile,  for  different  sized  battles.  The  results 
are  given  in  Table  3. 

Table 3: Measurement of Attacker Victories' a by battle size Quartiles

                        Data Ordered By:
              Defender's Strength     Attacker's Strength
Quartile         a         err           a         err
   1            1.53       0.08         1.56       0.08
   2            1.60       0.09         1.50       0.08
   3            1.73       0.09         1.24       0.09
   4            1.91       0.07         1.59       0.07

The value of a shows no systematic trend with battle size when size is determined by the
attacker's force size, but increases steadily with battle size when size is determined by the
defender's force size. This complex dependency on battle size was not observed by Hartley, who
classified battles using total strengths involved.

Given the unequal range of battle sizes in each quartile, it is informative to consider the
dependence of a on actual battle size. The average battle size for each quartile was taken as
representative of that quartile. These results are shown in Figure 6.
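
The quartile values in Table 3 can be compared directly. The following is a rough sketch only: it assumes the quoted errors are standard errors and that the quartile estimates are independent, and it compares only the first and last quartiles. The helper names are hypothetical.

```python
import math

# Gradient a and its quoted error from Table 3, quartiles 1 to 4.
by_defender = [(1.53, 0.08), (1.60, 0.09), (1.73, 0.09), (1.91, 0.07)]
by_attacker = [(1.56, 0.08), (1.50, 0.08), (1.24, 0.09), (1.59, 0.07)]

def z_score(first, last):
    """Difference between two estimates in units of their combined standard error."""
    (a_first, err_first), (a_last, err_last) = first, last
    return (a_last - a_first) / math.sqrt(err_first ** 2 + err_last ** 2)

def is_strictly_increasing(rows):
    """True when the gradient rises from every quartile to the next."""
    values = [a for a, _ in rows]
    return all(lower < higher for lower, higher in zip(values, values[1:]))

z_defender = z_score(by_defender[0], by_defender[-1])
z_attacker = z_score(by_attacker[0], by_attacker[-1])

print(f"ordered by defender: z = {z_defender:.2f}, "
      f"increasing = {is_strictly_increasing(by_defender)}")
print(f"ordered by attacker: z = {z_attacker:.2f}, "
      f"increasing = {is_strictly_increasing(by_attacker)}")
```

Under these assumptions the first and last quartiles differ by more than three combined standard errors when battles are ordered by defender strength, and by well under one when ordered by attacker strength. The check does not address the low third-quartile value in the attacker ordering.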


Figure  6:  The  value  of  the  gradient  a  against  Battle  Size.  Battle  sizes  determined  by  Attacker  size  are 
coloured  orange  and  by  Defender  size  are  coloured  blue.  Standard  error  bar  sizes  are  shown 
for  all  points. 

A  logarithmic  dependence  for  a  on  defender's  size  is  observed.  Logarithmic  dependences  on 
battle  size  have  been  observed  in  many  parameters  related  to  combat  [4].  A  possible 
explanation for the observed behaviour can be found on further consideration of the
Lanchester Equations used in this analysis (Equation 3), in which the rates of change are
determined by a single term. If, however, the attrition rates are governed by a "mixed law"
Lanchester Equation [12] with a polynomial strength dependence such as:

\[
\begin{aligned}
\frac{dx}{dt} &= -a y + c y^{2} + \dots \\
\frac{dy}{dt} &= -b x + d x^{2} + \dots
\end{aligned}
\tag{5}
\]

the residual behaviour of the non-linear terms could result in an apparent dependence of the
coefficient a on force strength as observed. The effect of such terms has been considered by
Woodcock and Dockery [27] in their examination of the use of Catastrophe Theory for combat
modelling.
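
The effect of a quadratic term can be illustrated numerically. The sketch below integrates the mixed law of Equation 5, truncated after the quadratic term, with a simple forward-Euler scheme. All parameter values are invented for illustration, and the function names are hypothetical. For the pure Square Law the quantity (x0^2 - x^2)/(y0^2 - y^2) equals a/b throughout the battle, so any departure from a/b is due to the extra terms.

```python
def integrate(x0, y0, a, b, c=0.0, d=0.0, t_end=5.0, dt=0.001):
    """Forward-Euler integration of dx/dt = -a*y + c*y**2 and dy/dt = -b*x + d*x**2."""
    x, y = x0, y0
    for _ in range(round(t_end / dt)):
        dx = (-a * y + c * y * y) * dt
        dy = (-b * x + d * x * x) * dt
        x, y = x + dx, y + dy
    return x, y

def helmbold_ratio(x0, y0, x, y):
    """Ratio of the differences of squared strengths for the two sides."""
    return (x0 ** 2 - x ** 2) / (y0 ** 2 - y ** 2)

# Illustrative values only; they are not fitted to any historical battle.
X0, Y0, A, B = 1000.0, 800.0, 0.05, 0.04

x_sq, y_sq = integrate(X0, Y0, A, B)                  # pure Square Law
x_mx, y_mx = integrate(X0, Y0, A, B, c=1e-5, d=1e-5)  # with quadratic terms

ratio_square = helmbold_ratio(X0, Y0, x_sq, y_sq)
ratio_mixed = helmbold_ratio(X0, Y0, x_mx, y_mx)
print(f"a/b = {A / B:.3f}, square law = {ratio_square:.3f}, mixed law = {ratio_mixed:.3f}")
```

With these values the Square Law run reproduces a/b closely, while the mixed law run does not, so a fit that assumes the Square Law would report an attrition coefficient ratio that depends on the strengths involved.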

Catastrophe  Theory  is  a  method  for  examining  non-linear  dynamics  where  the  potential 
function  describing  the  system's  evolution  is  treated  as  a  folded  manifold  [28].  Such  manifolds 
allow  multiple  input  values,  for  some  parameter,  for  the  same  output  value  over  a  restricted 
domain.  The  potential  function  defining  the  manifold  is  obtained  from  the  dynamics  of  the 
system  under  study.  Catastrophe  Theory  analyses  degenerate  critical  points  of  the  potential 
function  where  not  just  the  first  derivative,  but  one  or  more  higher  derivatives,  of  the  potential 
function  are  also  zero.  These  are  called  the  germs  of  the  catastrophe  geometries.  The 
degeneracy  of  these  critical  points  can  be  examined  by  expanding  the  potential  function  as  a 
power  series  in  small  perturbations  of  its  parameters.  There  are  nine  basic  catastrophe  types. 

The  major  problem  for  the  application  of  Catastrophe  Theory  to  combat  is  in  the  definition  of 
the  manifold  potential.  Current  theories  of  combat  are  only  able  to  produce  part  of  the  germ 
for any of the basic catastrophe types. Unfortunately, the terms in the germ that have no
mechanism to support their inclusion are critical for the catastrophe behaviour. Catastrophe
Theory  was  not  considered  further  in  the  present  work. 

This  completes  the  comparison  of  results  using  the  database  developed  for  the  present  work 
with  previous  results.  It  should  be  clear  that  the  results  presented  above  are  consistent  with 
Hartley's  findings.  The  lack  of  variation  observed  in  the  value  of  the  coefficient  a,  using 
different  methods  to  segment  the  database,  supports  the  conclusion  that  the  bias  inherent  in 
historical  data  does  not  produce  an  observable  effect  in  the  analyses.  Therefore,  the  database 
may  be  taken  as  representative  of  the  real  world  and  not  merely  as  an  artefact  of  the  sampling 
methodology. 


5.  Stochastic  Forms  of  Lanchester's  Equations 

Put  simply,  a  stochastic  process  is  described  by  a  variable  whose  value  changes  in  time  in  an 
unpredictable  way.  Such  processes  can  be  discrete,  when  the  variable's  value  can  only  change 
at  specified  fixed  points,  or  continuous  when  the  value  can  change  at  any  time.  Stochastic 
processes  may  also  take  continuous  values,  when  the  underlying  variable  can  take  any  value 
within  a  specified  range,  or  discrete  values  where  only  certain  specified  values  are  allowed. 

A  Markov  process  is  a  particular  type  of  stochastic  process  where  only  the  current  value  of  a 
variable  is  relevant  for  predicting  its  future  evolution.  A  continuous  time,  discrete  value 
Markov  process  has  been  demonstrated  to  produce  a  stochastic  attrition  model  analogous  to 
Lanchester's  deterministic  equations  [29].  Most  modern  combat  simulations  use  Markov 
processes  to  describe  attrition.  The  stochastic  theory  of  attrition  has  been  comprehensively 
explored  by  a  number  of  workers  and  is  readily  accessible  [29]. 
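
A continuous time, discrete value Markov attrition process of this kind can be sketched as follows. This is an illustrative construction and not the formulation of [29]: it assumes that side x loses one unit at rate a*y and side y loses one unit at rate b*x, and that the battle is fought until one side is eliminated. The function name, parameter values and termination rule are all assumptions.

```python
import random

def simulate_battle(x0, y0, a, b, rng):
    """One realisation of a continuous-time, discrete-value Markov attrition process."""
    x, y, t = x0, y0, 0.0
    while x > 0 and y > 0:
        rate_x_loss = a * y            # rate at which side x loses one unit
        rate_y_loss = b * x            # rate at which side y loses one unit
        total_rate = rate_x_loss + rate_y_loss
        t += rng.expovariate(total_rate)   # exponential waiting time to next casualty
        if rng.random() < rate_x_loss / total_rate:
            x -= 1
        else:
            y -= 1
    return x, y, t

rng = random.Random(1)   # fixed seed so the sketch is repeatable
results = [simulate_battle(50, 40, 0.05, 0.04, rng) for _ in range(200)]

x_wins = sum(1 for _, y, _ in results if y == 0)
print(f"x wins {x_wins} of {len(results)} simulated battles")
```

Only the current strengths enter the transition rates, which is the Markov property described above; the spread of outcomes across runs arises solely from the random timing and ordering of casualties.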

Stochastic  analogues  of  Lanchester's  Equations  are  a  specific  case  of  the  stochastic  linear 
system  of  equations  for  two  stochastic  variables: 




\[
\begin{aligned}
dx &= A(x, y)\,dt + S_1(x, y)\,dz_1, \quad x(0) = x_0 \\
dy &= B(x, y)\,dt + S_2(x, y)\,dz_2, \quad y(0) = y_0
\end{aligned}
\tag{6}
\]

where A and B are functions of the stochastic variables x and y, and possibly also of time t,
that describe the regular and stochastic evolution of x and y. The functions S1 and S2 describe
the magnitude of the stochastic variance in x and y resulting from the action of the stochastic
functions z1 and z2. The form of z1 and z2 depends on the type of stochastic process being
investigated.  Normal  probability  distributions  are  generally  used  for  continuous  stochastic 
variables  where  the  law  of  large  numbers  is  assumed  to  hold.  Although  continuous  variables 
are  only  an  approximation,  the  difference  between  adjacent  allowed  values  of  the  force 
strength  variables  is  much  smaller  than  the  magnitude  of  the  strengths  themselves.  As  such, 
this  will  be  the  only  case  considered  in  the  present  work. 

Following  the  approach  of  Black  and  Scholes  [32],  stochastic  analogues  of  Lanchester's  Square 
Law  are  obtained  when  linear  dependences  on  the  complementary  variables  are  used  in 
Equation  6.  This  substitution  produces  Equation  7,  which  is  the  same  as  the  system  studied  by 
Amacher  and  Mandallaz  (their  Equation  6): 

\[
\begin{aligned}
dx &= -(a\,dt - \sigma_1\,dz_1)\,y, \quad x(0) = x_0 \\
dy &= -(b\,dt - \sigma_2\,dz_2)\,x, \quad y(0) = y_0
\end{aligned}
\tag{7}
\]
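
A single realisation of this system can be generated numerically. The sketch below is a minimal Euler-Maruyama scheme, assuming Equation 7 is read as dx = -(a dt - sigma1 dz1) y and dy = -(b dt - sigma2 dz2) x, with the two noise increments drawn independently (zero correlation). The function name and all parameter values are hypothetical.

```python
import math
import random

def euler_maruyama(x0, y0, a, b, sigma1, sigma2, t_end, dt, rng):
    """Euler-Maruyama steps for dx = -(a dt - sigma1 dz1) y, dy = -(b dt - sigma2 dz2) x."""
    x, y = x0, y0
    sqrt_dt = math.sqrt(dt)
    for _ in range(round(t_end / dt)):
        # Independent normal increments with variance dt (an assumption of this sketch).
        dz1 = rng.gauss(0.0, sqrt_dt)
        dz2 = rng.gauss(0.0, sqrt_dt)
        # Both updates use the strengths from the start of the step.
        x, y = x - (a * dt - sigma1 * dz1) * y, y - (b * dt - sigma2 * dz2) * x
    return x, y

# Illustrative values only.
deterministic = euler_maruyama(1000.0, 800.0, 0.05, 0.04, 0.0, 0.0,
                               5.0, 0.001, random.Random(1))
noisy = euler_maruyama(1000.0, 800.0, 0.05, 0.04, 0.02, 0.02,
                       5.0, 0.001, random.Random(42))
print("sigma = 0:   ", deterministic)
print("sigma = 0.02:", noisy)
```

Setting both sigma values to zero recovers the deterministic Square Law trajectory, which gives a simple check on the scheme.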

Most studies of the application of stochastic forms of Lanchester's Equations have tended to
emphasise their Markov process properties [11, 12] (see footnote 4). The present work will
examine the relationship between observed behaviours and the parameters defining this
stochastic system separately. It is not proposed to provide comprehensive or rigorous coverage
of the methods of stochastic calculus or their application to the study of analogues to
Lanchester's Equations. A rigorous treatment of stochastic calculus theory can be found in the
book by Klebaner [30] and its specific application to solution of attrition equations in the
work of Amacher and Mandallaz [31].

Using a matrix formulation of the system, Amacher and Mandallaz developed an analytic general
solution for the expectation values of the stochastic variables (actually the square of the
stochastic variables) in the form of an infinite series product of matrices. Interestingly, they
chose not to report any further analytic investigations into the behaviour of the system,
preferring instead to examine their solutions numerically using synthetic data obtained through
simulation of the behaviour of the dynamical system. The same starting conditions were used and
the results of 1000 runs of the system were studied. Their conclusions are interesting. Battle
duration had a highly skewed distribution, to the extent that mean battle duration was found not
to be a useful summary statistic. Significant departures from a normal distribution were
observed for casualty values. They considered this surprising, which is in itself surprising
since the emergence of a log-normal distribution for casualties is a clear consequence of the
model. The outcomes of the simulations were found to be sensitive to small variations in the
initial conditions. A key conclusion of their work was the importance of battle termination
conditions for the simulation's outcome. In short, the general solution, while interesting, is
not particularly useful. This pioneering work does not appear to have been followed up.

Footnote 4: By which is meant an examination of the behaviour resulting from the stochastic
nature of Equation 7 without separating the effects resulting from the mean and variance terms
inside the brackets of Equation 7.

Previous  work  by  the  author  [7]  also  examined  solutions  to  stochastic  forms  of  Lanchester's 
Equations  (Equation  7).  In  contrast  to  Amacher  and  Mandallaz,  a  general  solution  was  not 
sought,  instead  focussing  on  solution  of  approximations  to  the  equations  themselves.  This 
enabled  the  functional  form  of  the  force  strength  variables  to  be  more  easily  identified  and 
compared  with  historical  data  using  the  same  approach  as  that  outlined  in  the  present  work. 
A  log-normal  distribution  for  casualty  values  was  found  which  was  consistent  with  the 
expectation  of  the  approximated  equations  and  that  observed  by  Amacher  and  Mandallaz 
[31].  A  further  difference  between  Amacher  and  Mandallaz  and  the  author's  work  is  in  the 
interpretation  given  to  the  origin  of  the  stochastic  term  in  Equation  7,  which  they  regarded  as 
merely  the  result  of  time  dependent  random  disturbances  of  the  attrition  processes.  The 
author's  previous  work  has  attempted  an  interpretation  of  this  contribution  and  shown  how  it 
can arise from interactions between the "system" of forces in combat and the remaining
non-combat processes affecting those forces.

5.1  Ito's  Change  of  Variable  Method 

A  standard  approach  in  stochastic  calculus  is  to  employ  a  change  of  variable  method, 
commonly  ascribed  to  Ito,  to  explore  the  evolution  of  quantities  defined  from  the  system's 
stochastic variables [30]. This approach is essentially the same as that used by Fowler [12] who
used the Fokker-Planck equation to examine the transition probability density in his
formulation of stochastic attrition. This approach has also been used to derive the
Black-Scholes differential equation [32].

Let f = f(x,y,t) be a function of two stochastic variables and time such that f is twice
differentiable in x and y and also once differentiable in t. Following Navin's [30] informal
approach  to  the  development  of  Ito's  lemma: 

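\[
df = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dx
   + \frac{\partial f}{\partial y}\,dy
   + \frac{1}{2}\left( \frac{\partial^2 f}{\partial x^2}\langle dx\,dx \rangle
   + \frac{\partial^2 f}{\partial y^2}\langle dy\,dy \rangle
   + 2\,\frac{\partial^2 f}{\partial x\,\partial y}\langle dx\,dy \rangle \right) + \dots
\tag{8}
\]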

This  expression  can  be  simplified,  if  the  relationships  between  the  dynamical  variables  are 
known,  by  substituting  the  first  and  second  order  differentials  into  Equation  8.  This  rule  can 
be  applied  to  the  examination  of  stochastic  forms  of  Lanchester's  Equations  by  first  limiting 
consideration  to  systems  that  are  described  by  Equation  6.  This  provides  the  first  order 
differential  terms.  Using  simplified  notation  [30]  (ignoring  the  delta  functions),  on  taking  the 
expectation  values,  the  second  order  terms  are  similarly  obtained.  For  example: 

\[
\langle dx\,dx \rangle = \langle (A\,dt + S_1\,dz_1)(A\,dt + S_1\,dz_1) \rangle
\to A^2\,dt^2 + 2 A S_1\,dz_1\,dt + S_1^2\,dz_1^2
\tag{9}
\]

However, as dt → 0, it can be shown [30] that:

\[
\langle dz^2 \rangle \to dt, \qquad
\langle dt^2 \rangle \to 0, \qquad
\langle dt\,dz \rangle \to 0
\tag{10}
\]

which gives:

\[
\langle dx^2 \rangle \to S_1^2\,dt, \qquad
\langle dy^2 \rangle \to S_2^2\,dt
\tag{11}
\]


Both the z1 and z2 terms from Equation 6 can then be replaced by dt, as the integrations over z1
and z2 are independent except for the cross term, from which follows:

\[
\langle dx\,dy \rangle \to S_1 S_2\,dz_1\,dz_2
\tag{12}
\]

which in the integral over the stochastic processes can be replaced by

\[
\langle dx\,dy \rangle = S_1 S_2\,\rho\,dt
\tag{13}
\]

where ρ is the correlation coefficient between the two stochastic processes, which have an as
yet unspecified degree of coherence. This allows Ito's rule, Equation 8, to be written as:


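\[
df = \left( \frac{\partial f}{\partial t} + A\,\frac{\partial f}{\partial x}
   + B\,\frac{\partial f}{\partial y}
   + \frac{1}{2} S_1^2\,\frac{\partial^2 f}{\partial x^2}
   + \frac{1}{2} S_2^2\,\frac{\partial^2 f}{\partial y^2}
   + S_1 S_2\,\rho\,\frac{\partial^2 f}{\partial x\,\partial y} \right) dt
   + \left( S_1\,\frac{\partial f}{\partial x} + S_2\,\rho\,\frac{\partial f}{\partial y} \right) dz
\tag{14}
\]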


for dynamical systems described by Equation 6. This expression can then be used to examine
how any function f defined for the system under study changes over time.
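
The limiting rules used above can be checked numerically. The sketch below assumes that the increments dz are normally distributed with zero mean and variance dt, and builds a second increment with a chosen correlation RHO to the first. The values of T, N and RHO are arbitrary choices for illustration.

```python
import math
import random

rng = random.Random(0)             # fixed seed so the sketch is repeatable
T, N, RHO = 1.0, 100_000, 0.6      # total time, number of steps, assumed correlation
dt = T / N
sqrt_dt = math.sqrt(dt)

sum_dz1_sq = 0.0    # accumulates dz1 * dz1, expected to approach T
sum_dt_dz1 = 0.0    # accumulates dt * dz1, expected to approach 0
sum_dz1_dz2 = 0.0   # accumulates dz1 * dz2, expected to approach RHO * T
for _ in range(N):
    dz1 = rng.gauss(0.0, sqrt_dt)
    dw = rng.gauss(0.0, sqrt_dt)
    # Second increment with correlation RHO to the first and the same variance dt.
    dz2 = RHO * dz1 + math.sqrt(1.0 - RHO ** 2) * dw
    sum_dz1_sq += dz1 * dz1
    sum_dt_dz1 += dt * dz1
    sum_dz1_dz2 += dz1 * dz2
sum_dt_sq = N * dt * dt             # equals T * dt, which vanishes as dt -> 0

print(f"sum dz1^2   = {sum_dz1_sq:.4f}  (compare T = {T})")
print(f"sum dt*dz1  = {sum_dt_dz1:.6f}  (compare 0)")
print(f"sum dt^2    = {sum_dt_sq:.6f}  (compare 0)")
print(f"sum dz1*dz2 = {sum_dz1_dz2:.4f}  (compare RHO*T = {RHO * T})")
```

Summed over the interval, the dz^2 terms behave like dt, the dt^2 and dt dz terms become negligible, and the cross term behaves like the correlation coefficient times dt, in line with Equations 10 and 13.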


5.2  Further  Application  of  the  Change  of  Variable  Method 

The  relevant  Ito  change  of  variable  rule  is  further  simplified  by  explicit  specification  of  the 
undefined  functions  of  Equation  14.  When  the  instantiation  of  the  general  linear  system  that  is 
given  in  Equation  7  is  considered,  the  change  of  variable  rule  applicable  to  the  stochastic 
analogue  of  Lanchester's  Square  Law  is  obtained. 


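\[
df = \left( \frac{\partial f}{\partial t} - a y\,\frac{\partial f}{\partial x}
   - b x\,\frac{\partial f}{\partial y}
   + \frac{1}{2}\sigma_1^2 y^2\,\frac{\partial^2 f}{\partial x^2}
   + \frac{1}{2}\sigma_2^2 x^2\,\frac{\partial^2 f}{\partial y^2}
   + \sigma_1 \sigma_2\,\rho\,x y\,\frac{\partial^2 f}{\partial x\,\partial y} \right) dt
   + \left( \sigma_1 y\,\frac{\partial f}{\partial x}
   + \sigma_2\,\rho\,x\,\frac{\partial f}{\partial y} \right) dz
\tag{15}
\]

A number of quantities of interest for the Lanchester system with constant rate coefficients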
(Equation  7)  were  examined  using  this  approach.  All  but  one  failed  to  produce  any 
relationship  of  interest.  Consider  again  the  modified  equation  of  state  known  as  the  Helmbold 
Equation: 


\[
f(x, y, t) = \ln\left( \frac{x_0^2 - x^2}{y_0^2 - y^2} \right)
           = \alpha \ln\left( \frac{x_0}{y_0} \right) + \beta
\tag{16}
\]


The right hand side depends only on initial values and therefore is a constant. Hence df = 0.
For this to hold true for all values of x, y and t, it must hold true separately for the component
terms in both dt and dz after application of Equation 15 to the centre expression of Equation
16.
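
As a sketch of the first step of this calculation (the full simplification is given in Appendix A and is not reproduced here), the centre expression of Equation 16 can be written as f = ln(x0^2 - x^2) - ln(y0^2 - y^2), whose partial derivatives are:

\[
\frac{\partial f}{\partial t} = 0, \qquad
\frac{\partial f}{\partial x} = -\frac{2x}{x_0^2 - x^2}, \qquad
\frac{\partial f}{\partial y} = \frac{2y}{y_0^2 - y^2}, \qquad
\frac{\partial^2 f}{\partial x\,\partial y} = 0
\]

\[
\frac{\partial^2 f}{\partial x^2} = -\frac{2\,(x_0^2 + x^2)}{(x_0^2 - x^2)^2}, \qquad
\frac{\partial^2 f}{\partial y^2} = \frac{2\,(y_0^2 + y^2)}{(y_0^2 - y^2)^2}
\]

Assuming Equation 15 is read with drift terms -ay and -bx and noise magnitudes σ1 y and σ2 x, setting the coefficient of dt to zero would then give the condition:

\[
\frac{2 a x y}{x_0^2 - x^2} - \frac{2 b x y}{y_0^2 - y^2}
- \frac{\sigma_1^2\,y^2\,(x_0^2 + x^2)}{(x_0^2 - x^2)^2}
+ \frac{\sigma_2^2\,x^2\,(y_0^2 + y^2)}{(y_0^2 - y^2)^2} = 0
\]

The cross term in ρ drops out because the mixed second derivative vanishes.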


Considering the term for the dt component, an expression can be obtained which must be
identically equal to zero in order that df = 0. After some rather lengthy algebra to simplify the
expression, which the interested reader can find in Appendix A, the following relationship
can be obtained that must also hold true in order that df = 0.


x2  a 2  cr2  x2 

fi 

l  Xo) 

2  i  2  ~  2  2 

(  2  > 

y  b  ^2  t0 

i+L 

v  y.j 

(17) 


The functional relationship between the strengths for both sides can be more easily seen if
X = x/x0 and Y = y/y0. Then:


f 

1  + 

V _ 

c 

1  + 

v 


F2y 


=  F 


(18) 


This relationship can be considered as a corollary to Helmbold's Equation and will be
examined using the historical database in a subsequent section.


6.  Analyses  of  the  Distribution  of  Historical  Data 

A  cursory  examination  of  the  analysis  of  the  historical  database  presented  in  Figures  1  to  3 
shows  considerable  scatter  in  the  results  around  the  mean  values.  Given  the  logarithmic  scales 
employed,  it  is  clear  that  the  scatter  in  the  results  is  as  important  for  the  relationship  between 
dependent and independent variables as the underlying relationship between them given by
Equation 4. The pattern in the scatter of casualty results from the historical database has
already been shown to be consistent with a log-normal distribution [7]. The observed
log-normal distribution of casualties was expected on the basis of an approximate solution to
the stochastic forms of Lanchester's Equations (Equation 7) and the observation that the
historical data was predominantly from the region of the approximation's validity.

This  section  will  examine  the  new  expanded  historical  database  for  consistency  with  the 
results  of  the  previous  work  and  also  expand  the  analysis  of  the  patterns  in  the  distributions 
in the results. The enlarged database now contains sufficient records to permit analysis of the
distributions found from different segments of the overall database, and a search for
dependencies. It will also expand the examination to quantities other than those previously
examined.

Ordinary  Least  Squares  Regression  (LSR)  was  used  in  this  analysis  in  preference  to  the 
currently  popular  Maximum  Likelihood  Estimation  (MLE)  method  [33].  MLE  is  believed  to  be 
more  robust  in  handling  data  that  do  not  meet  the  requirements  for  rigorous  LSR  (normally 
distributed residuals etc). However, with data known to be affected by bias, LSR is generally
more useful in being easier to understand and because of the availability of well-established
diagnostic tools [34]. Residual plots in particular were used extensively in the
present work to identify data affected by bias and adjust the analysis accordingly.

6.1  Distribution  of  Initial  Strengths 

Prior  to  an  examination  of  the  distribution  of  battle  casualties,  it  is  necessary  to  consider  the 
distribution  of  initial  strength  values  for  evidence  of  the  bias  referred  to  in  Section  3. 

The  frequency  distribution  of  initial  force  sizes  was  determined  by  dividing  the  range  of  force 
sizes  into  intervals  of  10000  participants  and  counting  the  number  of  times  a  force  strength 
from  the  database  occurred  in  each  interval.  This  is  shown  on  a  logarithmic  scale  in  Figure  7. 
Noting  the  low  correlation  between  the  magnitude  of  attacker  and  defender  force  initial 
strengths  (Figure  5),  this  distribution  counts  the  initial  strength  of  both  sides  separately  rather 
than the combined forces total. Each battle therefore contributes two data points to the
distribution.

Figure 7: Force Size Distribution, and regression coefficient of determination

For the analysis presented in Figure 7, examination of the residuals confirmed that the
under-representation of small battles had biased the smallest data value. The difference between
the observed value and expected value can be used to compensate for bias in the cumulative
number distributions in the following analysis. The discrete nature of the dependent variable
defines the measurement resolution. Where the value being measured is of the same size as
the measurement resolution, the results can be strongly affected by fluctuations. Residual
plots also identified which data are affected by fluctuations. In Figure 7 this corresponds
roughly to the region where the logarithm of the Force Initial Strength is greater than 12,
which could then be ignored by the regression analysis. The regression line (and coefficient of
determination) shown in Figure 7 was produced after excluding those data points subject to bias
and fluctuations.


It  should  also  be  noted  that  the  distribution  of  initial  force  sizes  does  not  exhibit  any 
indication  of  the  influence  of  a  normal  distribution.  The  completely  different  behaviour  of  the 
initial  strength  frequency  and  the  casualty  frequency  supports  the  contention  that  such 
behaviour  results  from  the  attrition  process  and  is  not  an  artefact  of  the  sampling  or  analysis 
procedure. 




6.2  Distribution  of  Casualties 

The distribution of the natural logarithm of each side's battle casualties was determined by
dividing the range of observed logarithm of casualty values into intervals of size 0.5, which is
equivalent to adjacent interval boundaries having a ratio of 1.65 in casualty values. This
results in an even spread of casualty values on a logarithmic scale, which is necessary for the
accurate representation of the distribution. The number of times the logarithm of the casualty
value from the database occurred in each interval was then counted. This is shown in Figure 8.


Figure 8: Number Distribution of Battles, Cumulative Distribution of Such Battles, and Theoretical
Cumulative Normal Distribution vs. ln(Casualties)

The  frequency  distribution  forms  a  bell  shaped  curve,  but  with  considerable  stochastic 
variability  across  the  peak.  This  limits  the  ability  to  determine  what  form  the  distribution 
takes,  in  particular  whether  it  is  consistent  with  a  normal  distribution.  The  cumulative 
casualty  distribution  was  formed  by  summing  the  number  of  occurrences  with  casualties 
greater  than  the  specified  value  and  is  also  shown  in  Figure  8.  The  cumulative  distribution  for 
occurrences  greater  than  the  specified  value  was  chosen  as  it  confines  the  effect  of  data  bias  to 
a  few  entries  at  the  lower  end  of  the  scale  instead  of  incorporating  the  bias  in  all  the  data 
points. 
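
The binning and cumulative counting described above can be sketched as follows. The bin width is left as a parameter; the default of 0.5 is an assumption chosen because adjacent bin edges then differ by a factor of e^0.5, which is about 1.65. The function names and the sample casualty values are hypothetical. Each cumulative count here includes the bin itself and all higher bins.

```python
import math
from collections import Counter

def log_histogram(casualties, width=0.5):
    """Count values of ln(casualties) falling in bins [k*width, (k+1)*width)."""
    counts = Counter(math.floor(math.log(c) / width) for c in casualties)
    return dict(sorted(counts.items()))

def counts_above(histogram):
    """For each bin, the number of observations in that bin or any higher bin."""
    running, result = 0, {}
    for k in sorted(histogram, reverse=True):
        running += histogram[k]
        result[k] = running
    return dict(sorted(result.items()))

# Hypothetical casualty values, invented for illustration only.
sample = [120, 450, 800, 1500, 1500, 3200, 9000, 25000]
hist = log_histogram(sample)
cumulative = counts_above(hist)

for k in hist:
    lower_edge = k * 0.5
    print(f"ln(casualties) >= {lower_edge:4.1f}: in bin = {hist[k]}, "
          f"in bin or above = {cumulative[k]}")
```

Because the running total is built from the top of the scale downwards, any under-counting of the smallest battles alters only the lowest few entries, which is the reason given above for choosing this form of cumulative distribution.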

The  previous  section  has  established  that  a  bias  in  favour  of  larger  battles  in  the  historical 
record  does  indeed  exist.  This  bias  can  be  allowed  for  by  calculating  the  theoretical  curve  that 
the  distribution  would  follow  assuming  a  normal  distribution  with  the  mean  and  standard 
deviation  of  the  historical  data,  but  using  the  results  from  Figure  7  to  estimate  the  number  of 
small battles "missing" from the database. Figure 8 also shows this theoretical cumulative
frequency distribution. This analysis counts the casualties of both sides separately rather than
the combined total. Each battle therefore contributes two data points to the distribution.

Ignoring  the  lowest  strata  of  data,  where  bias  is  expected  to  produce  under-representation,  the 
close  agreement  between  the  historical  data  cumulative  probability  distribution  and  the 
expected  theoretical  probability  distribution  is  apparent,  as  in  the  previous  work.  Ignoring  the 
data  points  affected  by  bias,  the  correlation  coefficient  between  the  observed  and  theoretical 
distribution  was  evaluated  as  0.997.  These  results  are  again  consistent  with  the  expectation  of 
the  proposed  stochastic  forms  of  Lanchester's  Equations. 
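
The comparison between observed and theoretical cumulative distributions can be sketched as follows. This is an illustration only: the data values are invented, the correction for "missing" small battles is omitted, and the helper names are hypothetical. The theoretical curve is the number of observations expected above each threshold for a normal distribution with the sample mean and standard deviation.

```python
import math
import statistics

def normal_survival(value, mean, sd):
    """P(Z > value) for a normal distribution, via the complementary error function."""
    return 0.5 * math.erfc((value - mean) / (sd * math.sqrt(2.0)))

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ln(casualties) values, invented for illustration only.
ln_casualties = [4.8, 5.5, 6.1, 6.4, 6.7, 7.0, 7.3, 7.6, 8.1, 8.9, 9.4]
mean = statistics.fmean(ln_casualties)
sd = statistics.stdev(ln_casualties)

thresholds = [5.0, 6.0, 7.0, 8.0, 9.0]
observed = [sum(1 for v in ln_casualties if v > t) for t in thresholds]
expected = [len(ln_casualties) * normal_survival(t, mean, sd) for t in thresholds]

r = pearson(observed, expected)
print(f"correlation between observed and theoretical counts: {r:.3f}")
```

A correlation close to 1 indicates that the observed cumulative counts follow the normal curve for ln(casualties), and hence a log-normal curve for the casualties themselves.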

The database size used in the previous work, and the prodigious amount of data required to
undertake frequency analysis with any degree of reliability, limited that study to examination
of the data as a single coherent sample. The larger size of the current database permits
frequency  analysis  to  be  undertaken,  with  some  degree  of  confidence,  when  the  database  is 
segmented  according  to  the  side's  posture  and  size  of  the  battle. 

6.2.1  Segmented  by  Posture 

The above analysis procedure was also applied to only the casualties of the attacking force
from each battle in the database, the results of which are shown in Figure 9. The number of
data points in the sample is of course half that used for Figure 8, as each battle previously
contributed two casualty values (one attacker and one defender).


Figure 9: Number Distribution of Battles, Cumulative Distribution of Such Battles, and Theoretical
Cumulative Normal Distribution vs. ln(Casualties) for the attacking side.

A similar conclusion can be drawn regarding the behaviour of the casualty distribution for the
attacking  side,  again  consistent  with  the  expectation  of  the  stochastic  forms  of  Lanchester's 
Equations.  Repeating  this  procedure  using  only  casualties  for  the  defending  side  yields  similar 
results. 

6.2.2  Segmented  by  Outcome 

A  similar  analysis  can  be  undertaken  when  the  data  is  segmented  according  to  the  battle's 
outcome  and  the  casualty  distributions  of  the  winning  and  losing  sides  determined.  Results 
very  similar  to  those  of  Figure  9  were  obtained  for  both  sides  and  were  consistent  with  the 
expectation  of  the  stochastic  forms  of  Lanchester's  Equations.  It  is  more  interesting,  however, 
to  compare  the  observed  distribution  for  the  winning  and  the  losing  sides  which  are  shown  in 
Figure  10. 


Figure 10: Number Distribution of Battles vs. ln(Casualties). Winner casualties are coloured red while
loser  casualties  are  green. 

The  two  casualty  distributions  are  similar,  but  that  for  the  loser  has  larger  values  for  both  the 
mean  and  variance  than  that  for  the  winners.  The  difference  in  mean  values  is  statistically 
significant.  Examination  of  battle  narratives  may  provide  an  explanation  that  does  not  imply 
greater  rates  of  attrition.  In  particular,  the  inclusion  of  prisoners  in  the  casualty  values  affects 
the  loser  more  than  the  winner.  While  prisoners  are  an  important  component  of  casualties, 
especially  in  the  determination  of  combat  end  points,  they  do  not  result  from  attrition.  The 
losing  side  may  also  be  subjected  to  pursuit  and  desertion,  both  of  which  affect  the  loser  more 
than the winner. An effort was made during database construction not to include prisoners
taken after combat in the reported casualty values; however, prisoners taken during combat
cannot be easily separated from the casualties due to attrition in most of the data. All of
these effects produce higher casualty values for the loser without the need to invoke a larger
attrition rate.


While the difference between the mean values of the distributions is statistically significant,
the difference between the variances is small enough that, at the 90% confidence level, they
cannot be regarded as representing different distributions. Solutions to the stochastic differential
equations  [31]  show  that  the  mean  value  depends  on  both  the  systematic  and  stochastic 
contributions  and  hence  on  attrition  rates,  while  the  variance  depends  only  on  the  stochastic 
part.  These  observations  are  consistent  with  stochastic  processes  acting  evenly  on  both  sides  of 
a  battle. 
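The comparison of means and variances described above can be sketched as follows. This is a minimal illustration only: the casualty values are synthetic log-normal samples, the function name `compare_log_casualties` is hypothetical, and Welch's t statistic and a simple variance ratio are assumed stand-ins for whatever significance tests were actually applied to the database.

```python
import math
import random
import statistics


def compare_log_casualties(winner, loser):
    """Return Welch's t statistic for the means and the variance ratio (loser/winner)
    of ln(casualties). Both tests are assumptions, not the report's stated method."""
    ln_w = [math.log(c) for c in winner]
    ln_l = [math.log(c) for c in loser]
    mean_w, mean_l = statistics.fmean(ln_w), statistics.fmean(ln_l)
    var_w, var_l = statistics.variance(ln_w), statistics.variance(ln_l)
    # Standard error of the difference in means without assuming equal variances.
    se = math.sqrt(var_w / len(ln_w) + var_l / len(ln_l))
    return (mean_l - mean_w) / se, var_l / var_w


# Synthetic data: same spread for both sides, loser mean shifted upwards.
rng = random.Random(1)
winner = [rng.lognormvariate(6.5, 1.4) for _ in range(400)]
loser = [rng.lognormvariate(7.0, 1.4) for _ in range(400)]
t_stat, var_ratio = compare_log_casualties(winner, loser)
print(f"Welch t = {t_stat:.2f}, variance ratio = {var_ratio:.2f}")
```

A large t statistic together with a variance ratio close to one reproduces the qualitative pattern reported here: different means, indistinguishable variances.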

6.2.3  Segmented  by  Force  Size 

Examining any dependence of the casualty distribution on force size was straightforward:
the casualty values were ordered according to the initial strength of the side that suffered
them. Forces with the same initial strength but different casualty values were ordered by
casualty value. The ordered casualty values were then divided into quartiles of initial
force size. Quartiles were used to specify force size to ensure that enough data was
available in each segment to determine the casualty distribution with a degree of
reliability. Summary statistics for the results are given below.

Table 4: Summary Statistics for ln(Casualties) using Force Initial Size Quartiles

Quartile   Mean   Variance   Standard Error   Kurtosis   Skewness
1          5.28   2.31       0.05             -0.02      -0.39
2          6.38   1.69       0.04             -0.30      -0.31
3          7.05   1.70       0.05             -0.37      -0.25
4          8.95   2.26       0.05              0.19       0.18

The values for skewness and kurtosis are consistent with distributions close to normal, and
hence log-normal for the casualty values. The change in sign for these quantities for the 4th
Quartile (largest battles) may result from the inclusion of a small number of extremely large
battles in the database. The resulting casualty distributions are shown in Figure 11. Further
segmentation based on the side's posture was not possible, as can be seen from the variability
across the peak of each of the distributions.
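The quartile segmentation described at the start of this subsection can be sketched as below. The record layout (pairs of initial strength and casualties), the function name `quartile_summary` and the example values are all hypothetical, chosen only to show the ordering and splitting steps.

```python
import math
import statistics


def quartile_summary(records):
    """records: list of (initial_strength, casualties) pairs, one per side per battle.
    Returns summary statistics of ln(casualties) for each force-size quartile."""
    # Tuples sort by initial strength first; ties are broken by the casualty value.
    ordered = sorted(records)
    n = len(ordered)
    summary = []
    for q in range(4):
        segment = ordered[q * n // 4:(q + 1) * n // 4]
        ln_c = [math.log(c) for _, c in segment]
        summary.append({
            "quartile": q + 1,
            "n": len(ln_c),
            "mean": statistics.fmean(ln_c),
            "variance": statistics.variance(ln_c),
            "std_error": statistics.stdev(ln_c) / math.sqrt(len(ln_c)),
        })
    return summary


# Hypothetical records: casualties fixed at 10% of strength, for illustration only.
records = [(1000 * k, 100 * k) for k in range(1, 41)]
summary = quartile_summary(records)
for row in summary:
    print(row)
```

Kurtosis and skewness are omitted from the sketch because the Python standard library does not provide them.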

Repeating the previous analysis for these results indicates that the distributions are also
consistent with the expectations of the stochastic attrition process. Mean casualty values
increase with battle size (quartile) but the change in the variance is not statistically significant.
The distribution variance depends on the magnitude of the stochastic contribution to the
attrition rate in Equation 7, which is again seen to act evenly in all battles.

Figure 11: Number Distribution of Battles vs. ln(Casualties) for different initial size forces. 1st
Quartile values are blue, 2nd Quartile are black, 3rd Quartile are purple and 4th Quartile is
brown.


6.3  Results  Distributions  in  Helmbold's  Relationship 

Part  of  the  motivation  for  the  present  work  was  to  extend  the  examination  of  stochastic 
patterns  in  historical  combat  statistics  beyond  consideration  of  each  side's  casualty  behaviour. 
Equation  14  shows  that  any  arbitrary  function  defined  from  the  system's  stochastic  variables 
should  also  exhibit  stochastic  behaviour.  This  should  then  be  observable  through  examination 
of  the  frequency  distribution  of  that  function's  value  using  historical  data. 

Section 5.2 examined the application of the differential rule of Equation 15 to the Helmbold
Equation (Equation 4). However, analytic examination of the resulting expressions has
yielded little of interest beyond the simple rule of Equation 18. Certainly, no clear indication
of how the historical data (Figure 2) should be distributed about the mean value has been
found.

The  frequency  distribution  of  the  Helmbold  Ratio  in  Figure  2  about  the  line  of  best  fit  can  be 
obtained  by  modifying  the  procedure  used  in  the  previous  section.  Two  different  approaches 
for  segmenting  the  database  were  examined.  The  posture  of  the  winning  side  was  used  in  the 
initial  investigation,  after  which  the  effect  of  the  size  of  the  Force  Ratio  on  the  distribution  was 
considered. 

For each data point in Figure 2, the Force Ratio value was used to calculate an expected
Helmbold Ratio using the line of best fit determined by regression analysis of the relevant
database segment. The logarithm of the expected Helmbold Ratio was then subtracted from the
logarithm of the historical Helmbold Ratio. The frequency distribution of
these delta-ln-Helmbold-Ratio values was then determined using the procedure described in the
previous section. Summary statistics for the number distributions using this database
segmentation are given below, with the distributions plotted in Figure 12.

Table 5: Summary Statistics by winner's postures of ln(Helmbold Ratio Distribution)

Posture    Mean   Variance   Standard Error   Kurtosis   Skewness
All        0.00   1.89       0.03             0.86        0.01
Attacker   0.00   1.25       0.03             0.72       -0.49
Defender   0.00   1.15       0.05             2.32        0.54

It  is  not  clear  whether  the  wide  range  of  values  for  both  the  kurtosis  and  skewness  is  a  result 
of  the  large  fluctuations  in  the  data  or  is  indicative  of  underlying  differences  in  the 
distributions.  The  analysis  method  is  responsible  for  the  mean  values  of  zero.  A  normal 
distribution  should  have  a  kurtosis  of  3  and  a  skewness  of  0.  The  difference  between  the 
variances  for  attacker  wins  and  defender  wins  is  not  statistically  significant.  The  larger  value 
for  the  variance  observed  for  the  entire  database  results  from  the  offset  between  the  regression 
lines  for  attacker  wins  and  defender  wins. 
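The residual calculation behind these statistics can be sketched as follows, assuming an ordinary least-squares line with an intercept is fitted to ln(Helmbold Ratio) against ln(Force Ratio). The function name `delta_ln_helmbold` and the data values are hypothetical. The sketch also shows why the mean values are zero by construction: residuals about a least-squares line with an intercept always sum to zero.

```python
import math
import statistics


def delta_ln_helmbold(force_ratios, helmbold_ratios):
    """Residuals of ln(Helmbold Ratio) about a least-squares line in ln(Force Ratio)."""
    xs = [math.log(fr) for fr in force_ratios]
    ys = [math.log(hr) for hr in helmbold_ratios]
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Historical value minus the value expected from the line of best fit.
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical (Force Ratio, Helmbold Ratio) pairs for one database segment.
force_ratios = [0.5, 0.8, 1.0, 1.5, 2.0, 3.0]
helmbold_ratios = [0.6, 1.1, 0.9, 1.4, 2.5, 2.2]
residuals = delta_ln_helmbold(force_ratios, helmbold_ratios)
print(f"mean residual = {statistics.fmean(residuals):.2e}")
```

Fitting one line to a pooled set of two segments with offset regression lines would leave the mean at zero but widen the residual spread, which matches the larger variance reported for the entire database.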


Figure 12: Helmbold Ratio number distribution (horizontal axis: delta ln(Helmbold Ratio)).
Attacker victories are coloured red while defender victories are green. Entire database is
coloured black.

While  this  data  does  not  allow  the  form  of  the  distribution  to  be  confirmed,  the  observed 
distributions  are  not  inconsistent  with  the  behaviour  expected  from  a  normal  distribution. 

6.3.1 Segmented by Battle Size

The size of the database limits the number of segments into which the data of Figure 2 can be
split while still allowing the distribution to be determined. Each data point in Figure 2
contributes only one value of the Helmbold Ratio, in contrast to two casualty values. Study of the
dependence of the distribution of the Helmbold Ratio on battle size therefore had to ignore the
winner's posture. The number of battles is not sufficient to allow division into quartiles
large enough for frequency analysis with reasonable accuracy. The maximum
number of size divisions that could be used was three. Even then, the fluctuations across the
distribution were larger than desired. The dependence of the Helmbold Ratio on the Force
Ratio suggests that the magnitude of the Force Ratio should be used to quantify battle size.
Summary statistics for this segmentation are given in Table 6 and the distributions are plotted
in Figure 13.


Table 6: Summary Statistics by battle size of ln(Helmbold Ratio Distribution)

              ln(Force Ratio)      delta ln(Helmbold Ratio)
Battle Size   Minimum   Maximum    Mean    Variance   Standard Error   Kurtosis   Skewness
Lower         -3.20     0.00        0.00   1.90       0.03             0.59       -0.09
Middle         0.01     0.71        0.03   1.73       0.03             1.96        0.02
Upper          0.72     3.66       -0.04   2.09       0.05             0.10        0.13


Figure  13:  Helmbold  Ratio  number  distribution  dependence  on  Force  Ratio  magnitude.  Lower  third 
are  coloured  blue,  middle  third  are  coloured  black  and  upper  third  are  coloured  purple. 

The skewness and kurtosis show no definite trend with battle size as determined by the Force
Ratio. The observed differences in the distributions are not statistically significant. The
distribution of values for the Helmbold Ratio does not appear to depend on the Force Ratio in
this analysis. This data does not allow the form of the distribution to be confirmed. However,
the observed distributions are not inconsistent with the behaviour expected from a normal
distribution.


7.  Predictions  of  Battle  Outcomes 


Lanchester models of combat are generally considered to be capable of providing insight
into a limited number of questions, the most important of which is 'What conditions are
required for success?' (Who will win?). Helmbold [4] examined the correlation between
the value of the Advantage parameter V and the battle's outcome:


$$V = \ln\mu = \frac{1}{2}\ln\left[\frac{1-(x/x_0)^2}{1-(y/y_0)^2}\right] \qquad (19)$$


where x denotes the attacker and y the defender in each battle, with the subscript 0 marking
initial strength. The advantage parameter has been used in most subsequent work, including
Hartley [5]. This simple relationship, using only initial and final strengths, does not include
non-attrition considerations. It is a stochastic measure of success where a negative value
predicts the attacker will be successful and a positive value predicts a defender's success.
For this reason it is generally known as the Defender's Advantage V. To date, there appear to
have been no studies of whether the probability that V successfully predicts the outcome
depends on other factors, such as the battle's size.
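A minimal sketch of how Equation 19 might be evaluated for a single battle is given below. It assumes the factor of one half in front of the logarithm, and that x0, y0 are initial and x, y final strengths; the function names and the example strengths are hypothetical. Treating V = 0 as an attacker prediction is an arbitrary choice made only for this sketch.

```python
import math


def defenders_advantage(x0, x, y0, y):
    """Defender's Advantage V with x the attacker and y the defender.
    x0, y0 are initial strengths and x, y final strengths (assumed reading of Eq. 19)."""
    attacker_term = 1.0 - (x / x0) ** 2
    defender_term = 1.0 - (y / y0) ** 2
    if attacker_term <= 0 or defender_term <= 0:
        raise ValueError("both sides must have taken casualties")
    return 0.5 * math.log(attacker_term / defender_term)


def predicted_winner(v):
    """Positive V predicts a defender success, negative V an attacker success."""
    return "defender" if v > 0 else "attacker"


# Hypothetical battle: the attacker loses 30% of its force, the defender 10%.
v = defenders_advantage(x0=10000, x=7000, y0=8000, y=7200)
print(round(v, 3), predicted_winner(v))
```

The success probabilities discussed in this section would then be the fraction of battles in a segment for which `predicted_winner` agrees with the recorded outcome.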

Examination  of  the  effect  of  battle  size  on  the  probability  that  V  successfully  predicts  the 
outcome  has  proven  difficult.  As  already  discussed,  battle  size  is  difficult  to  quantify  as  it  can 
be  measured  in  a  number  of  ways.  Most  workers  define  the  size  of  a  battle  as  the  total  of  all 
forces  involved  in  the  battle  (both  sides).  The  present  work  has  found  trends  when  battle  size 
is  determined  by  the  strength  of  specific  single  sides.  However,  consideration  of  any 
dependence  of  the  probability  that  V  successfully  predicts  the  outcome  on  battle  size  has  been 
deferred  to  subsequent  work. 

The  present  work  was  only  able  to  examine  the  historical  database  for  any  dependence  on  the 
winner's  posture  and  battle  date.  The  value  of  the  Defender's  Advantage  and  known  battle 
outcome  were  used  to  determine  the  probability  that  the  outcome  was  successfully  predicted 
for  each  of  the  epochs  (section  3.5)  used  in  this  database.  This  probability  was  determined  for 
each  epoch  as  a  whole  and  also  segmented  by  the  posture  of  the  winning  side  for  each  epoch. 
These  results  are  listed  in  the  following  table.  A  date  was  assigned  to  the  epoch  by  averaging 
the  date  for  all  battles  constituting  that  sub-division. 

Table 7: Probability that Defender's Advantage predicted outcome successfully

Dataset        Representative   Probability Defender     Probability Attacker     Probability All
               Year             successfully predicted   successfully predicted   successfully predicted
Ancient        900              0.95                     0.88                     0.90
17th Century   1650             0.67                     0.94                     0.85
18th Century   1745             0.71                     0.91                     0.83
Revolution     1795             0.90                     0.91                     0.91
Empire         1810             0.74                     0.83                     0.80
ACW            1860             0.84                     0.80                     0.81
19th Century   1865             0.67                     0.85                     0.78
WWI            1915             0.74                     0.87                     0.82
WWII           1945             0.56                     0.81                     0.74
Post WWII      1980             0.62                     0.80                     0.77

These  results  confirm  that  initial  strengths  and  casualties  are  significant,  if  not  dominant, 
determinants  for  successfully  predicting  battle  outcomes.  The  results  are  also  shown  in  Figure 
14.  It  is  interesting  to  note  that  the  Defender's  Advantage  appears  better  at  predicting  overall 
attacker  successes  (0.85)  than  defender  successes  (0.73).  This  difference  is  statistically 
significant  at  the  95%  confidence  level. 


Figure  14:  Probability  that  the  Defender's  Advantage  correctly  predicts  the  outcome  segmented  by 
epoch  and  posture.  Attacker  victories  are  coloured  red  while  defender  victories  are  green. 
Non-segmented  data  is  coloured  blue. 

Figure 14 also includes straight-line regression fits to the data. The small variation in the
success rate of the advantage parameter with date, the low values of the coefficient of
determination and the lack of a systematic trend are together consistent with the success rate
being independent of date. This also implies that it does not depend on other quantities that
correlate with date, such as the technology used in battle.
There is one exception to this observation.

The probability that the advantage parameter successfully predicts the battle's outcome
behaves consistently for all three cases considered up to around the year 1900. After
that date the observed behaviour for attacker victories and for all battles remains consistent
with each other and with the previous trend (no dependence on date). The
success in predicting defender victories, however, shows a noticeable change,
with the advantage parameter becoming much less successful at correctly predicting a
defender victory. The change in the gradient of the lines of best fit between these two periods
(pre-1900 and post-1900) is statistically significant, indicating that the observed reduction
represents a real change in the contribution of attrition to the defender's success. At present
no explanation for this behaviour has been found.

7.1  Combat  Entropy 

In the early 1990s Shannon entropy (H) was applied to combat as a measure of the disorder
introduced as a result of combat attrition [35], where c is the number of casualties at time t
and N the force strength at the same time:

$$H = -\frac{c(t)}{N(t)}\,\ln\frac{c(t)}{N(t)} \qquad (20)$$


It  was  proposed  that  the  difference  in  entropy  between  both  sides  would  represent  some 
measure  of  the  outcome  of  the  battle.  This  has  subsequently  been  examined  by  a  number  of 
workers  including  Dexter  [36],  who  found  a  good  correlation  between  the  entropy  difference 
and  battle  outcome.  The  database  used  in  that  study  was  small  (around  100  entries)  and  the 
criteria  used  for  its  construction  are  not  clear,  leaving  unanswered  questions  about  the  impact 
of  bias.  Although  better  agreement  was  found  using  entropy  than  Lanchester  predictions  of 
victory  (Defender's  Advantage),  the  possibility  of  bias  could  not  be  excluded. 
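A minimal sketch of the entropy calculation follows. It assumes Equation 20 has the Shannon form -(c/N) ln(c/N) for each side and that the entropy difference is taken as attacker minus defender; both the sign convention and the function names are assumptions made for illustration, and no rule mapping the difference to a predicted winner is implied.

```python
import math


def combat_entropy(casualties, strength):
    """Shannon-style entropy term -(c/N) ln(c/N) for one side (assumed form of Eq. 20)."""
    p = casualties / strength
    if not 0.0 < p <= 1.0:
        raise ValueError("casualty fraction must lie in (0, 1]")
    return -p * math.log(p)


def entropy_difference(c_att, n_att, c_def, n_def):
    """Attacker entropy minus defender entropy (sign convention assumed here)."""
    return combat_entropy(c_att, n_att) - combat_entropy(c_def, n_def)


# Hypothetical battle: attacker loses 30% of its force, defender 10%.
print(entropy_difference(3000, 10000, 800, 8000))
```

Note that -(c/N) ln(c/N) is not monotonic in the casualty fraction: it peaks at c/N = 1/e and falls to zero at c/N = 1, so a side with very heavy losses can have a lower value than one with moderate losses.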

The  ability  of  the  entropy  difference  to  correctly  predict  the  outcome  of  a  battle  was  also 
examined  in  the  present  work  using  the  same  methodology  as  for  Defender's  Advantage.  The 
results  also  show  a  correlation  between  entropy  difference  and  combat  outcome,  with  the 
same  trends  as  reported  above  using  the  Defender's  Advantage  parameter.  For  this  reason, 
the  results  of  the  combat  entropy  study  have  not  been  reproduced.  However,  the  present 
work  found  that  combat-entropy  was  less  effective  in  correctly  predicting  the  outcome  than 
the  Defender's  Advantage,  in  contrast  to  Dexter's  conclusion  [36]. 

7.2 Examination of the Corollary to Helmbold's Equation

The application of Ito's method to examine the differential of the Helmbold Ratio in section
5.2 produced the relationship of Equation 17, where the ratio on the left-hand side (hereafter
called the End Ratio) should depend only on the initial Force Ratio. The End Ratio depends
only on the initial and final force values and has similarities to the force-dependent parts of
the Defender's Advantage parameter of Equation 19. This suggests that the End Ratio
may also be of use in distinguishing battles won by the attacker from battles won by the
defender.

The  database  developed  for  the  present  work  can  easily  be  used  to  examine  this  relationship, 
which  is  shown  in  Figure  15  using  logarithmic  scales  along  with  the  regression  coefficient 
obtained  from  a  least  squares  fit  to  each  of  the  data  segments.  As  usual,  battles  with  attacker 
victories  are  coloured  red  while  defender  victories  are  coloured  green. 


Figure  15:  Relationship  between  the  End  Ratio  and  initial  Force  Ratio.  Attacker  victories  are  coloured 
red  while  defender  victories  are  green. 

Defender victories are seen to lie predominantly in the top half of Figure 15, with positive
values for the logarithm of the End Ratio, while attacker victories predominantly have
negative values for this parameter. The low values of the correlation coefficients obtained for
each data segment, and shown in Figure 15, are consistent with no dependence of the End Ratio
on the initial Force Ratio. The logarithm of the End Ratio does appear to differentiate between
battles won by the attacker and battles won by the defender, with the value zero
(corresponding to the End Ratio being unity) being the dividing line. The independence of the
End Ratio from the Force Ratio is also consistent with the stochastic processes acting evenly
on both sides of a battle.


8.  Battle  as  a  Complex  Adaptive  System 

Lanchester's attrition equations describe the evolution of a single battle in time. However,
analysis of historical data has shown that many of the relationships derived from Lanchester's
Equations also describe the behaviour of the same parameters across a collection of
unrelated battles [4], [5], [6], [7]. Helmbold's pioneering work [4] made the assumption that
the attrition coefficients were approximately the same for all battles. Hartley [5] sought to
relax this assumption and has examined this issue at length. Neither has proposed a possible
mechanism for why this behaviour is observed. Section 3.3 deferred consideration of that
issue to here. Consideration of why this might be the case, albeit following an empirical rather
than mathematically rigorous approach, requires a brief review of those Complex Adaptive
Systems concepts necessary for an understanding of scale free behaviour.

The  Lanchester  Equations  are  a  model  for  the  behaviour  of  two  interacting  populations.  Such 
dynamical  systems  (two  interacting  populations)  are  of  considerable  interest  and  have  been 
widely  studied.  Another  such  system  is  the  Lotka-Volterra  model  of  predator-prey  interacting 
populations,  which  has  also  been  examined  from  a  stochastic  viewpoint  [37].  This  is  of  interest 
for  the  present  work  as  it  is  widely  believed  to  be  an  analogue  for  the  Lanchester  Equations. 

The state of such systems is typically described by its location in phase space. The phase space
coordinates in this case are restricted to the magnitudes of the populations, which are
sufficient to describe the system's time dependence without the explicit inclusion of the rate of
change as an additional coordinate. A defining characteristic of the Lotka-Volterra Equations
is that their solutions encompass periodic behaviour. In other words, their phase space
trajectories are closed curves, indicating that they describe conservative stable systems (the
same point in phase space can be revisited). This represents a significant difference from a system
described by the Lanchester Equations. Such systems are fundamentally dissipative: their
phase space trajectories have a point attractor with both force strengths equal to zero. The
most useful properties of the Lotka-Volterra Equations therefore cannot be applied to the
Lanchester Equations. This highlights the importance of understanding the general properties
of dissipative systems to provide better insight into the dynamics of combat. Dissipative
systems differ from conservative systems in a number of critical ways, the most important of
which results from the reversible nature (at least in principle) of a conservative system
compared with the fundamentally irreversible nature of a dissipative system.
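The irreversible character of Lanchester attrition can be illustrated numerically. The sketch below assumes the deterministic Square Law in the form dx/dt = -a*y, dy/dt = -b*x; the symbols a and b, the function name and all parameter values are hypothetical. It uses simple Euler integration and stops once either side reaches zero, so it shows only that the trajectory never closes, not the long-run end state.

```python
def lanchester_square(x0, y0, a, b, dt=0.001, t_max=50.0):
    """Euler integration of dx/dt = -a*y, dy/dt = -b*x until one side reaches zero."""
    x, y, t = float(x0), float(y0), 0.0
    path = [(x, y)]
    while x > 0 and y > 0 and t < t_max:
        # Update both strengths from the values at the start of the step.
        x, y = x - a * y * dt, y - b * x * dt
        x, y = max(x, 0.0), max(y, 0.0)  # strengths cannot go negative
        t += dt
        path.append((x, y))
    return path


path = lanchester_square(x0=1000, y0=800, a=0.05, b=0.05)
# Both strengths only ever decrease, so no phase-space point is revisited.
monotone = all(x2 <= x1 and y2 <= y1 for (x1, y1), (x2, y2) in zip(path, path[1:]))
print(monotone, path[-1])
```

A Lotka-Volterra trajectory integrated the same way would instead loop back towards its starting point, which is the contrast drawn in the text.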

Reversible systems are typically closed or isolated from external influence. Their evolution is
described by the current value of their endogenous variables, whose past history is irrelevant.
They can be treated as if they are in equilibrium, or at least can be described as quasi-static
systems, in which the system follows a succession of equilibrium states and moves
between them sufficiently slowly that at each moment the system can be
treated as if it were in equilibrium. This model of system dynamics has been widely applied
since  the  time  of  Newton,  and  with  considerable  success  [38].  However,  during  the  20th 
century  there  has  been  an  increasing  acceptance  that  a  whole  class  of  problems  has  been 
largely  ignored,  because  they  are  not  suited  to  examination  by  the  techniques  developed  in 
the  study  of  reversible  systems.  Many  dissipative  systems  are  poorly  described  using 
techniques  developed  for  reversible  systems.  Just  as  classical  thermodynamics  was  used  as  a 
model  for  many  reversible  systems  [38],  the  recent  (and  ongoing)  development  of  irreversible 
thermodynamics  [39]  is  beginning  to  be  seen  as  a  starting  place  for  more  appropriate  studies 
of  other  irreversible  systems  [40]. 

Irreversible systems, in contrast, are typically open or interacting with their environment,
and describing their evolution also requires knowledge of appropriate exogenous variables.
The development of stochastic forms of Lanchester's Equations is one approach to including a
model for that interaction in the theory of attrition. The past history of the system can be important
for  understanding  its  future  development.  If  a  system  is  reversible,  its  entropy  does  not 
change.  Indeed,  this  can  be  used  as  a  definition  of  a  reversible  process.  Classical 
thermodynamics  approximates  real  systems  as  the  sum  of  closed  quasi-static  reversible  and 
irreversible  parts.  The  entropy  of  the  irreversible  part  increases  in  accordance  with  the  second 
law  as  the  system  evolves  towards  its  equilibrium  end-state,  which  consequently  can  be  seen 
as  the  state  with  maximum  entropy.  This  is  a  good  description  of  the  behaviour  of  closed 
systems.  Many  real  systems  are  open  in  addition  to  being  irreversible.  The  1977  Nobel  Prize 
for  Chemistry  was  awarded  in  large  part  for  demonstrating  that  such  systems  do  not  evolve  to 
maximum entropy and equilibrium states [38]. Instead, they evolve to states for which the
production  of  entropy  is  minimised,  at  least  in  the  linear  thermodynamics  regime.  This 
stationary  state  is  maintained  through  its  dynamic  interaction  with  the  environment,  and 
allows  the  system  to  exist  in  a  more  structured  condition  than  would  be  permitted  at 
equilibrium. 

Equilibrium is the condition of maximum entropy, which is also the state of minimum
information content. The ensemble of components forming the system has its individual
states distributed according to the Boltzmann distribution [40]. In the large number (size)
limit, this can be replaced by the Poisson (discrete) or Gaussian (continuous) probability
distributions resulting from the corresponding random processes [41].

These probability distributions exhibit a flat spectral density (see footnote 5), or white noise,
for variations about their mean values, which is also regarded as characteristic of random processes.

$$S(\nu) = \text{constant} \qquad (21)$$

This is the case for closed or isolated systems at equilibrium or quasi-static equilibrium. As
demonstrated above, open or interacting systems are not subject to the same constraints as
equilibrium systems and may consequently exhibit different spectral densities in their
behaviour. This is usually interpreted as meaning that non-equilibrium systems are
not described by random behaviour. While this is a sufficient condition, it is not a necessary
condition.

5 White noise is a random signal and its frequency spectrum is determined from the Fourier Transform of its
probability distribution. In this case it is a constant.

The random processes underlying equilibrium behaviour all employ one key assumption:
events occur independently of one another. When this does not hold, and there is partial
correlation between events, the system contains additional information describing that
correlation and white noise is no longer a description of the spectral density.

However,  this  behaviour  can  still  arise  from  a  random  process.  Consider  a  fishing  fleet  of  N 
boats  where  the  captain  of  each  boat  makes  a  random  choice  whether  to  fish  on  any  given  day 
independently  of  the  decisions  of  the  other  captains.  This  leads  to  a  simple  Poisson 
(equilibrium)  distribution  for  the  expected  number  of  boats  observed  fishing  per  day.  If  the 
effect  of  weather  (an  exogenous  variable)  on  the  expected  return  for  a  day's  fishing  is  then 
added  to  this  model,  the  probability  that  a  boat  will  fish  on  a  poor  day  will  become  less  than 
the  probability  that  it  will  fish  on  a  good  day.  Each  captain  still  makes  the  random  decision  to 
fish  or  not  independently  of  the  others,  but  the  decisions  are  now  correlated  through  the 
action  of  the  weather  in  biasing  the  probability.  This  leads  to  a  non-Poisson  distribution  of 
observed  fishing  boats  and  a  spectral  distribution  that  is  frequency  dependent,  where  the 
additional  information  content  describes  the  pattern  of  the  weather.  This  type  of  behaviour  is 
commonly  known  as  Self  Organised  Criticality. 
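The fishing fleet example can be simulated directly. The sketch below is illustrative only: the fleet size, the probabilities and the rule that each day is good or poor with equal chance are assumptions, and the function name is hypothetical. It demonstrates only the change in the distribution of daily counts (a much larger spread than independent decisions alone would give); because the simulated weather is itself uncorrelated from day to day, it does not demonstrate the frequency dependence of the spectrum.

```python
import random
import statistics


def simulate_fleet(n_boats, days, p_good, p_bad, rng):
    """Daily count of boats fishing. Each captain decides independently, but the
    weather sets a probability shared by all captains on that day."""
    counts = []
    for _ in range(days):
        p = p_good if rng.random() < 0.5 else p_bad  # shared exogenous weather
        counts.append(sum(rng.random() < p for _ in range(n_boats)))
    return counts


rng = random.Random(7)
# Same probability every day: no weather effect.
no_weather = simulate_fleet(100, 2000, 0.5, 0.5, rng)
# Good days favour fishing, poor days discourage it.
with_weather = simulate_fleet(100, 2000, 0.8, 0.2, rng)
var_no = statistics.variance(no_weather)
var_with = statistics.variance(with_weather)
print(var_no, var_with)
```

Both runs have the same mean of about 50 boats per day, but the shared weather term inflates the variance far beyond the value n*p*(1-p) expected from independent decisions at a fixed probability.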

Many  such  systems  are  known  in  nature  and  exhibit  a  characteristic  frequency  dependent 
spectral  distribution  commonly  known  as  Pink  noise  [40,  42]: 

$$S(\nu) \propto \frac{1}{\nu^{s}} \qquad (22)$$

where the exponent s is typically a real number with 0 < s < 2. This is equivalent to a frequency
dependent probability for the occurrence of system events. This type of system is known as
scale free, because regardless of the size (scale) of the spectrum under investigation the same
behaviour is observed (a small piece of the spectrum examined in detail or a much larger
section examined more coarsely). The same phenomenon that produces the microscopic behaviour
is also responsible for the macroscopic behaviour. Scale free behaviour has been observed in
many different dynamical systems, ranging from the frequency of earthquakes [30] to traffic
accidents [43].
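The scale free property follows directly from the power-law form of Equation 22. A short worked step, in which k is an arbitrary positive rescaling factor introduced here for illustration:

```latex
S(k\nu) \propto \frac{1}{(k\nu)^{s}} = k^{-s}\,\frac{1}{\nu^{s}}
\quad\Longrightarrow\quad
\frac{S(k\nu)}{S(\nu)} = k^{-s}
```

The ratio depends only on k and s, not on the frequency itself, so rescaling the frequency axis changes the amplitude of the spectrum but not its shape.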

A number of descriptions for the emergence of self-organised criticality have been proposed,
associated with the observation of pink noise. They are frequently used as analogues for other
complex systems exhibiting scale free behaviour. In systems that are not static but evolving,
the Red Queen Principle (or co-evolution) has also been suggested as a mechanism behind
self-organised criticality [44]. The Red Queen Principle [45] is based on the observation to
Alice by the Red Queen in Lewis Carroll's "Through the Looking Glass" that "in this place it takes
all the running you can do, to keep in the same place". If a number of dynamic systems coexist, the
random variations introduced by evolution in one system can produce a competitive
advantage (increased fitness) for that system, allowing it to capture a larger share of the
resources available to all. This means that a fitness increase in one evolutionary system will
tend to lead to a fitness decrease in another system. The only way that a system involved in a
competition can maintain its fitness relative to the others is by in turn improving its design. In
an evolutionary system, continuing development is needed in order to maintain its fitness
relative to the systems it is co-evolving with.

A possible explanation of why the pattern of behaviour observed from an ensemble of
different battles (Figure 1 and Figure 2) can be described using the behaviour expected within
a battle (Equation 4) can now be attempted. This is an observation of scale free behaviour in
which the same pattern of behaviour is observed at different scales of a phenomenon. Stochastic
forms of Lanchester's Equations include a model for the effect of the wider environment
(exogenous variables) on attrition [7]. Exogenous variables imply that combat is an open
system, which allows non-equilibrium states to be stable. An evolving non-equilibrium system
can result in the co-evolution of system variables and the Red Queen effect. Such variables
will exhibit a power law (pink noise) relationship. The cause of scale free behaviour in
attrition can then be understood as a consequence of the Red Queen effect. All that remains is
to identify quantities involved in the attrition process that should exhibit co-evolution and do
show a scale free (power law) relationship.

Dupuy  [13]  suggests  an  obvious  co-evolution  between  weapon  lethality  and  battlefield 
dispersion,  although  the  data  in  that  publication  is  not  particularly  suitable  for  quantitative 
study. The co-evolution can be explained using the following sequence of events. Side 1
introduces (evolution) a more effective weapon or tactics, increasing the attrition coefficient of
side 2. This changes the values of the coefficients α and β in Equation 4, moving
the battle data away from the line of best fit. Side 2 can choose to respond (co-evolution) to this
change by spreading its forces out. This reduces the number of targets available to side 1 and
their rate of loss. The effective value of the attrition coefficient is also reduced, which counters
the previous changes in α and β, restoring the status quo. The data from an ensemble of
battles should then cluster about the line representing the evolution of a single battle with
values of α and β describing an "average" battle.

Later  work  by  Dupuy  does  include  a  limited  amount  of  data  to  permit  a  preliminary 
investigation.  Dispersion  can  be  easily  measured  in  terms  of  the  number  of  troops  per  square 
kilometre.  Lethality  is  more  difficult  to  define,  let  alone  measure.  For  an  explanation  of  how 
lethality is defined and measured the interested reader is referred to Dupuy's work [46]. The
dispersion and lethality results are shown on logarithmic axes in Figure 18 along with a least
squares regression to the data.

Figure 17: ln(Troop Density) as a Function of ln(Lethality)


A  power  law  relationship  between  lethality  and  dispersion  is  consistent  with  these  results. 
This supports the contention that co-evolution is at work between quantities involved in
attrition, and that it acts to oppose any tendency for battle results to deviate consistently from the
average.  Scale  free  behaviour  of  some  parameters  should  be  expected,  and  is  a  possible 
explanation  for  the  observed  agreement  in  the  behaviour  of  the  historical  ensemble  of  battles 
with  the  expectation  of  a  single  battle. 
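
The regression on logarithmic axes described above can be sketched as follows. This is a minimal illustration, assuming a relationship of the form density = c * lethality^k. The function name `fit_power_law` and the numerical values are hypothetical and are not Dupuy's data.

```python
import numpy as np

def fit_power_law(x, y):
    """Fit y = c * x**k by ordinary least squares on (ln x, ln y)."""
    log_x = np.log(np.asarray(x, dtype=float))
    log_y = np.log(np.asarray(y, dtype=float))
    # A straight line in log-log space: ln y = k * ln x + ln c
    k, log_c = np.polyfit(log_x, log_y, 1)
    return k, np.exp(log_c)

# Hypothetical, noise-free values chosen only to exercise the fit.
lethality = np.array([1.0, 10.0, 100.0, 1000.0, 10000.0])
troop_density = 5000.0 * lethality ** -0.5

exponent, prefactor = fit_power_law(lethality, troop_density)
print(f"exponent k = {exponent:.3f}, prefactor c = {prefactor:.1f}")
```

A straight line on the log-log plot corresponds to a power law, and the fitted slope is the exponent. With real data the quality of the fit, not only the slope, would need to be examined before a power law is claimed.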


9.  Conclusions 


The  present  work  has  used  two  different  approaches  in  its  examination  of  the  behaviour 
expected  from  a  battle  where  attrition  is  described  by  Lanchester's  Equations.  It  is  important 
to  remember  that  Lanchester's  Equations  are  not  a  model  of  combat,  only  a  model  for  combat 
attrition.  The  equations  alone,  therefore,  cannot  be  expected  to  capture  other  effects  such  as 
the  movement  of  engaged  forces.  Both  methodologies  can  be  regarded  as  empirical,  rather 
than  mathematically  rigorous. 

The  present  work,  while  not  mathematically  rigorous,  has  applied  a  standard  stochastic 
calculus  method  to  find  the  equation  that  specifies  the  evolution  of  an  arbitrary  function  in  a 
system  with  two  stochastic  variables.  Similar  to  the  approach  of  Black  and  Scholes,  this 
differential  form  was  applied  to  a  number  of  functions  known  to  be  of  interest  for  the 
Lanchester Square Law system of equations. Application of this to Helmbold's Ratio led to
the discovery of a new construct, defined using both sides' initial and final force strengths, that
has proven able to differentiate between battles won by the attacker and battles won by the
defender.
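
For reference, the standard two-variable form of Ito's rule is sketched below. The drift terms (mu), volatility terms (sigma), Wiener increments (dW) and correlation (rho) are generic symbols assumed here for illustration; they are not taken from the report, whose own form is given in the main text.

```latex
% Assumed generic dynamics (illustrative notation only):
%   dx = \mu_x\,dt + \sigma_x\,dW_x, \qquad
%   dy = \mu_y\,dt + \sigma_y\,dW_y, \qquad
%   dW_x\,dW_y = \rho\,dt
\begin{aligned}
df(x,y,t) ={}& \left(
      \frac{\partial f}{\partial t}
    + \mu_x \frac{\partial f}{\partial x}
    + \mu_y \frac{\partial f}{\partial y}
    + \tfrac{1}{2}\sigma_x^2 \frac{\partial^2 f}{\partial x^2}
    + \tfrac{1}{2}\sigma_y^2 \frac{\partial^2 f}{\partial y^2}
    + \rho\,\sigma_x \sigma_y \frac{\partial^2 f}{\partial x\,\partial y}
  \right) dt \\
  &+ \sigma_x \frac{\partial f}{\partial x}\,dW_x
   + \sigma_y \frac{\partial f}{\partial y}\,dW_y
\end{aligned}
```

The first bracket is the deterministic (dt) term and the remaining terms carry the stochastic contributions from each variable.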

The  second  approach,  which  constituted  the  bulk  of  the  present  work,  compared  historical 
battle  data  to  the  behaviour  expected  from  a  battle  where  attrition  is  described  by  Lanchester's 
Equations.  This  required  a  comprehensive  examination  of  the  issues,  problems  and  constraints 
on  using  historical  data  for  any  form  of  analysis.  It  is  an  area  often  neglected  by  such  studies. 

All  battle  compilations  are  the  product  of  the  recursive  application  of  a  data  sampling  process. 
The  population  consists  of  all  battles.  This  is  first  sampled  to  produce  the  set  of  all  recorded 
battles.  Many,  especially  smaller  engagements,  are  never  recorded.  The  requirement  that  both 
the initial and final values of force strengths are known produces another sub-sampling stage
to  generate  the  set  of  all  recorded  battles  with  usable  data.  This  sampling  process  also 
discriminates  against  smaller  battles.  Larger  battles  receive  more  attention  and  hence  are  more 
likely  to  have  their  attributes  recorded.  All  battle  databases  are  themselves  samples  of  that 
sample.  Even  if  the  final  sampling  process  was  random,  the  process  of  recording  history 
generates  an  intrinsic  bias  towards  larger  battles.  This  bias  cannot  be  eliminated  and  any 
analysis  technique  must  include  procedures  for  identifying  and  dealing  with  that  bias.  A 
method for the identification of the effect of bias was examined. If analysis of the data gives
the same results both before and after the effects of bias in the data have been addressed, the
result can be considered insensitive to the effect of bias and indicative of actual behaviour
in  recorded  history. 
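
The before-and-after comparison can be illustrated with a small simulation. This is a sketch under stated assumptions and is not the procedure used in the present work: the population, the recording model `p_record` and the fitted quantity are all hypothetical, and in practice the probability that a battle is recorded is not known.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of battles (all values invented for illustration).
n_battles = 20000
size = rng.lognormal(mean=8.0, sigma=1.5, size=n_battles)  # troops engaged
log_force_ratio = rng.normal(0.0, 0.5, size=n_battles)     # ln(x0 / y0)
ordinate = 1.0 * log_force_ratio + rng.normal(0.0, 0.3, size=n_battles)

# Assumed recording model: larger battles are more likely to be recorded.
p_record = size / (size + np.median(size))
recorded = rng.random(n_battles) < p_record

x = log_force_ratio[recorded]
y = ordinate[recorded]

# Slope fitted to the recorded (size-biased) sample.
slope_biased = np.polyfit(x, y, 1)[0]

# Slope after inverse-probability weighting. np.polyfit multiplies the
# residuals by w before squaring, so the square root of the weight is passed.
weights = 1.0 / p_record[recorded]
slope_weighted = np.polyfit(x, y, 1, w=np.sqrt(weights))[0]

print(f"recorded fraction:      {recorded.mean():.2f}")
print(f"slope (biased sample):  {slope_biased:.3f}")
print(f"slope (bias-corrected): {slope_weighted:.3f}")
```

Here the fitted slope does not depend on battle size, so the two estimates agree to within sampling error. That agreement is the kind of result that would be regarded as insensitive to the bias.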

The present work, in considering a battle as the interaction between combat and non-combat
quantities in a complex adaptive system, has also shown how self-organised criticality can
produce scale free behaviour (the same pattern of behaviour at different scales of examination in
the  data).  This  is  believed  to  be  an  explanation  for  the  observation  that  the  behaviour  of  the 
results  from  an  ensemble  of  different  battles  can  be  described  using  that  expected  from  the 
evolution  of  a  single  battle. 

The results of this study of the behaviour of the initial and final strengths for both sides from
an ensemble of historical battles are consistent with the expectations of systems using the
stochastic Lanchester Equations. Segmentation of the database according to a number of
parameters, including the force's initial strength and the posture of the winning side, also yields
results consistent with the stochastic Lanchester Equations. Importantly, the comparisons
indicate  that  the  stochastic  parts  of  the  attrition  processes  act  evenly  on  both  sides  of  the 
battle,  regardless  of  how  the  database  was  segmented. 

The  missing  element  of  combat  theory  that  rigorously  links  all  the  studies  in  the  present  work 
together  is  a  theory  of  combat  termination.  No  general  or  accepted  theory  for  combat 
termination  has  to  date  been  developed  that  agrees  with  the  available  historical  results.  This  is 
clearly  the  major  remaining  problem  in  the  development  of  a  quantitative  model  of  combat. 

Appendix A: Application of Ito's Rule to Helmbold's Relationship


Let: 


\[
f(x,y,t) = \ln\left(\frac{x_0^2 - x^2}{y_0^2 - y^2}\right) = \alpha \ln\left(\frac{x_0}{y_0}\right) + \beta
\]

Given that:

The term in the first bracket (the dt term) can be found by substituting the differentials:

The first bracket in Equation A3 can be re-expressed as:

[Equation A4 not recoverable from the source scan]    (A4)


However, the equation of state (main text Equation 2), which is consistent with the
modified form of the equation of state in Equation A1 (see the discussion prior to Equation 4
in the main text), shows that:


[Equation A5, the equation of state, not recoverable from the source scan]    (A5)

Hence the first bracket of Equation A3 is zero. Therefore:

Equation A5 can also be used to replace the $x^2 - x_0^2$ term in Equation A6, leaving:

From which some elementary cross-multiplication produces the desired result of Equation 17
from the main text.
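
The vanishing of the dt term can be checked numerically in the deterministic limit. The sketch below assumes the standard Square Law form dx/dt = -a*y, dy/dt = -b*x. The coefficient names `a` and `b`, the function names and the numerical values are assumptions made for illustration and are not taken from the main text. Under this assumption the quantity ln((x0^2 - x^2)/(y0^2 - y^2)) should remain equal to ln(a/b) throughout the battle.

```python
import math

def square_law_strengths(x0, y0, a, b, t):
    """Closed-form solution of dx/dt = -a*y, dy/dt = -b*x at time t."""
    k = math.sqrt(a * b)
    x = x0 * math.cosh(k * t) - y0 * math.sqrt(a / b) * math.sinh(k * t)
    y = y0 * math.cosh(k * t) - x0 * math.sqrt(b / a) * math.sinh(k * t)
    return x, y

def helmbold_ordinate(x0, y0, x, y):
    """ln((x0^2 - x^2) / (y0^2 - y^2)), the left-hand side of f(x, y, t)."""
    return math.log((x0**2 - x**2) / (y0**2 - y**2))

# Hypothetical initial strengths and attrition coefficients.
x0, y0 = 1000.0, 1200.0
a, b = 0.02, 0.01

values = []
for t in (1.0, 5.0, 10.0, 20.0):
    x, y = square_law_strengths(x0, y0, a, b, t)
    values.append(helmbold_ordinate(x0, y0, x, y))
    print(f"t = {t:5.1f}  x = {x:8.1f}  y = {y:8.1f}  f = {values[-1]:.6f}")

print(f"ln(a/b) = {math.log(a / b):.6f}")
```

The printed value of f does not change with t, which is the deterministic counterpart of the statement that the first bracket of Equation A3 is zero.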